Toil and trouble

My worst fears have come to pass and for a while I thought the LEO-1 was not going to make it. I had made a test board to act as a fake control board so that I could test the memory board. I had tested it with the Real Time Clock chip and everything was working great. So I carried on building the memory board and connected up the two ZIF sockets. I burned some test data to one of the EEPROMs and put it in one of the sockets, connected up a couple of digits to one half of the data bus and used the test board to address the ROM. Lo and behold, the digits showed the correct data coming out of the ROM. It was working as expected. Here’s a picture of the set-up. The digits on the protoboard are showing 3 4 which is the first byte in the ROM.

Memory Board working

Memory board displaying data

I switched various addresses and watched the data come out and everything seemed fine, when suddenly the digits went all weird, like a square 8 which is an impossible display. It was pretty clear that the display was oscillating between two values at such a high speed that it looked like all the dots were on at the same time.

And so began several weeks of pure hell trying to figure out what was going on and what caused it. I was able to reliably repeat the problem by setting a certain address and then changing to the next address so that the data changed from 0 0 to 0 1. Often, but not always, the thing would start oscillating. I put my scope on it and found an 8 MHz oscillation on the data bus. I suspected the ROM was faulty but all the ROMs had the same behaviour. I suspected the test board because I had forgotten to buffer the address bus lines. That was a dumb mistake which I decided to fix, so I spent a week adding 74244 buffers to the test board. That didn’t fix the problem. Then I realised I had forgotten to buffer the control signals too, so I used a few spare gates on the board’s 7400 to do that. Didn’t fix it. Weeks had gone by and this random oscillation was still happening. I remembered that you can get this kind of problem if you forget to connect an unused input on a gate, so I checked the schematics and scoured the board for disconnected pins. No luck. Everything was as it should be. One night in early December as I sat there with my head in my hands, I realised that if I couldn’t solve this, there would be no chance of the LEO-1 working at all. My fairly trivial memory board wasn’t even stable at manual switching speeds! I thought about giving up, with all that time, effort and money down the drain. By the time I went to bed, I was pretty depressed.


Trying to figure out the oscillation problem

The next day I took one last crack at solving the problem. I had been studying all the data signals on my scope’s logic analyzer and had not had any ideas beyond unconnected pins and faulty chips.

You can see in the picture above that the test board’s ribbon cable was sticking up in the air. It was like that because I was using a short cable for the address bus and the long data bus cable had to be bent in order to be plugged in. Well, for some reason, while the oscillation problem was happening, I just happened to push the cable a bit flatter, and the oscillation stopped. That didn’t seem like a coincidence, so I tried changing the address with the cable flat: no oscillation. I bent the cable again and the next time I switched addresses, the oscillation started again. I could control whether the oscillation would happen or not by bending and straightening the cable. What’s more, the scope showed that the oscillation frequency was changing between about 7 and 8 MHz depending on how the cable was folded.

Then it hit me. Was the ribbon cable suffering from transmission line problems? Those problems that I had been obsessing over right back at the start? Things like reflections and ringing and all that annoying stuff? And did folding the cable do something random like perhaps change the impedance or cause cross-talk? And could that make the bus transceiver chip go into some kind of oscillation? Well, I was not sure and I’m still not sure now, but I have a theory. When reading data out of the ROM, the test-board end of the 12″ cable was not connected to anything. That meant that when the data bus switched from 00 to 01, that 1 bit went down to the other end of the 12″ cable and reflected back. Ordinarily I would have expected the reflection to die out pretty quickly and just cause a bit of ringing. But something about the folded cable caused it to keep bouncing, perhaps amplified by the bus driver itself. After reading up a bit about line termination, I did an experiment. I added a bunch of 82 ohm resistors at the memory board end of the cable, just before the 74245 transceiver. After that, I couldn’t repeat the problem any more. No amount of switching addresses or folding the cable in any way caused the oscillation and I haven’t seen the problem since. I don’t really understand why it works except that possibly the resistors absorb the reflection and damp it so that it doesn’t turn into an oscillation, but I still don’t understand why I got full permanent oscillation in the first place. I’d love to hear from any experts out there who can explain what the heck was going on.
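
For anyone who wants to play with the numbers, here is a rough sketch of the standard reflection coefficient formula for a transmission line. It is only an illustration, not an explanation of what my board was actually doing. The 100 ohm cable impedance is an assumed ballpark figure (I never measured mine), and the sketch treats the 82 ohm resistor as if it were the load at the end of the line, which may not match how the resistors were really wired.

```python
def reflection_coefficient(z_load, z0):
    """Fraction of an incident voltage wave that is reflected at the load.

    z_load: load impedance in ohms (use float("inf") for an open end)
    z0:     characteristic impedance of the line in ohms
    """
    if z_load == float("inf"):
        return 1.0  # open circuit: the whole wave bounces back
    return (z_load - z0) / (z_load + z0)


Z0 = 100.0  # ASSUMPTION: ballpark ribbon cable impedance, not measured

open_end = reflection_coefficient(float("inf"), Z0)
terminated = reflection_coefficient(82.0, Z0)
matched = reflection_coefficient(Z0, Z0)

print(f"open end:        {open_end:+.3f}")
print(f"82 ohm load:     {terminated:+.3f}")
print(f"matched ({Z0:.0f} ohm): {matched:+.3f}")
```

With these assumed values the open end reflects everything, while a resistor close to the cable impedance reflects only a small fraction, which at least fits with the resistors calming things down.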

As if that wasn’t enough trouble, the next thing that went wrong was that I discovered that I had not left nearly enough room between the address bus header and the first bunch of chips. There was no room left to solder any more wires and I still had a couple of busses to connect there. I found I had no option but to move the address bus header to the other side of the board. I had to disconnect all the wires, desolder the header, re-solder it and redo all the work of connecting the 24 address lines to the ZIF sockets. While I was at it, I bit the bullet and removed the majority of the thick annoying wires and replaced them all with magnet wire. I learned a valuable lesson there. You have to leave plenty of room between headers and the chips they connect to so that there’s room to solder multiple busses to the same header pins. From now on, I’m only going to use regular wire for power connections. Everything else will be done with thin magnet wire.

Address Bus

Address Bus connections

So, the LEO-1 lives on after all and right now I’m in the process of wiring up the main RAM and ROM sockets. After that it’s the I/O chips and then board number 1 of 4 is complete.

Memory Board

Memory board as of New Year 2016


Prior planning and preparation…

I haven’t updated this for ages and anyone reading it would be forgiven for thinking I had given up on the LEO-1. Nothing could be further from the truth. I finally got some answers about the worrying stuff I mentioned earlier, partly from some friendly guys on the Electronics Point forum. You can read the thread here.

To cut a long story short, it seems I have been overthinking this issue a little bit too much. I shouldn’t run into any trouble at the speeds my circuit will be running at. Some of the spiky stuff I was worried about even seems to be generated by the act of measuring it, due to reflections inside the scope’s probes. Some of it is also caused by doing tests on a breadboard with long ratty wires all over the place. When I build the real thing on real boards with short wires, it should be fine. I’m going to go on that assumption for now.

During this time I’ve also been figuring out what other parts I’ll be needing and getting them together. You may recall that I was worried that HCT parts were not the best parts to use and I actually did decide to back up on that and switch over to HC parts. It just wasn’t worth the risk of buying hundreds of chips only to find they don’t work the way I expected. So I cut my losses and reordered the original prototyping parts in HC. I now have a bunch of HCT chips that I won’t use but I’ll hold on to them for a rainy day. Once I had figured out what I was going to need, I ordered a ton of chips. There’s a company on eBay that sells unused surplus parts amazingly cheaply. For example, I was able to get about fifty 74HC32s for about $7. The rest of the stuff I’ve been getting from Mouser and some (like the circuit boards) from DigiKey. The EEPROM I’ve chosen is the Greenliant GLS29EE010 which is a 1Mbit device organised as 128K x 8 bits. They only cost $2 each; two of those in parallel and I’ve got a 16-bit ROM for the monitor program. At this point I decided I was going to have to get a reliable EEPROM programmer. I’d seen cheap Chinese device programmers on eBay but I’d also read appalling things about their reliability and usability. It sounded like a false economy that I couldn’t risk. Perhaps when I was a poor teenager but not now. So I bought a Phyton ChipProg 40, mainly because it has the GLS29EE010 on its supported device list, but also because it was available, and I could afford it. I’m happy to report that it works perfectly and I was able to burn some test garbage into my ROM chips. I was also able to use it to have a look at the old PICs that I’d programmed in 2008 on a PIC development board. The code was all still there. An amazing thing is Flash memory.
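
Putting two 8-bit EEPROMs in parallel means each 16-bit word has to be split into two bytes, one burned into each chip at the same address. Here is a minimal sketch of that split. The function name and the choice of which chip holds the low byte are my own assumptions for illustration; nothing is settled yet.

```python
CHIP_SIZE = 128 * 1024  # a 1Mbit x8 EEPROM holds 131072 bytes


def split_words(words):
    """Split 16-bit words into (low_bytes, high_bytes) images, one per chip.

    ASSUMPTION: one chip carries data bits 0-7 and the other bits 8-15.
    """
    if len(words) > CHIP_SIZE:
        raise ValueError("too many words for one pair of chips")
    for w in words:
        if not 0 <= w <= 0xFFFF:
            raise ValueError(f"not a 16-bit word: {w!r}")
    low = bytes(w & 0xFF for w in words)
    high = bytes((w >> 8) & 0xFF for w in words)
    return low, high


low_image, high_image = split_words([0x1234, 0xABCD])
print(low_image.hex(), high_image.hex())
```

Each image can then be written to its own file and burned into its own chip with the programmer.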

In other news, I wanted to have some red LED digits on the front panel for debugging and discovered some really nice smart hex display chips (HP 5082 7340), but they turned out to be obsolete, very expensive and difficult to get. They look so nice that I don’t know why they would be obsolete. I’ve never seen this kind of thing on any equipment before and wonder where they were used. Everything has the ubiquitous seven-segment displays, but not these. The last time I saw anything like them was on my first digital watch in 1978, but they were much tinier. Anyway, I found something similar, TIL311, on eBay and acquired four of them. That’s enough to display a 16-bit value. Here’s a couple of pictures:

TIL311 smart display


TIL311 in action
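
Each TIL311 takes a 4-bit value and shows it as one hex digit, so four of them cover a 16-bit bus by giving each display its own nibble. A quick sketch of that split (the function name is just for illustration):

```python
def to_hex_digits(value):
    """Split a 16-bit value into four 4-bit nibbles, most significant first."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in 16 bits")
    return [(value >> shift) & 0xF for shift in (12, 8, 4, 0)]


digits = to_hex_digits(0x3400)
print(digits)
```

In hardware there is nothing to compute, of course: bus lines 15-12 go to the leftmost display, 11-8 to the next, and so on.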


When I haven’t been experimenting with the actual parts, I’ve been drawing the schematics. So far I have drawn two of the four boards and I’ve been finding design flaws while doing it. As soon as I started drawing with real components I noticed I had missed a line driver here and there. I also found a potential race hazard that meant I had to revise the simulation. I hadn’t realised that the memory address decoding will take a finite amount of time to settle and that during that time, it will be possible to select multiple devices onto the data bus. If that happens even for 10 nanoseconds, it won’t be good for the devices or the power consumption, not to mention the stability of the machine. The solution is to wait an extra tick for the decoding to finish and only then actually assert the chip select signal. When I spotted this I found another similar issue and realised my instruction cycle of only 4 states was too simple. I had to increase it to 8 states for memory operations and 5 states for non-memory operations. Very disappointing, but makes sense since I haven’t seen any other designs out there with only 4 states for an instruction. This means that all instructions no longer take the same amount of time to execute which seems a bit weird. Still, I think it will work just fine.
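
To convince myself about the decoding hazard, it helps to model it. The sketch below is a deliberately simplified toy, not my actual decoder: it assumes two address bits pick one of four devices, that the bits change one at a time (skew), and that the decoder itself is instant. The strobe list is a hypothetical "chip select allowed" signal that is held off until one tick after the address has settled.

```python
def decode(addr_bits):
    """Ideal 2-to-4 decoder: two address bits pick device 0..3."""
    return addr_bits & 0b11


def bus_drivers(addr_steps, strobe_steps=None):
    """Return which device is selected at each time step (None = nobody).

    addr_steps:   the address bits as seen by the decoder at each tick
    strobe_steps: optional per-tick enable; if omitted, the decoder output
                  is used directly as the chip select (the ungated case)
    """
    drivers = []
    for i, addr in enumerate(addr_steps):
        enabled = strobe_steps is None or strobe_steps[i]
        drivers.append(decode(addr) if enabled else None)
    return drivers


# Address goes from 01 to 10, but bit 0 falls one tick before bit 1 rises,
# so the decoder briefly sees 00.
addr_steps = [0b01, 0b00, 0b10, 0b10]
strobe_steps = [True, False, False, True]  # wait an extra tick after settling

ungated = bus_drivers(addr_steps)
gated = bus_drivers(addr_steps, strobe_steps)
print("ungated:", ungated)
print("gated:  ", gated)
```

In the ungated run device 0 gets selected for one tick even though nobody asked for it. On real chips, which take time to let go of the bus, that is how two devices can end up driving the data bus at once. With the strobe, nothing is selected until the address is stable.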

I also figured it would be nice to have a means to switch off the main clock and be able to single step instructions with a button. I spent some time experimenting with ways to achieve that and added it to the schematic for the clock section. During this time I revisited the 555 timer, a familiar friend from my early digital learning days. I still have my old ‘Babani’ book IC 555 Projects by the very droll Mr. E.A. Parr B.Sc, C.Eng, M.I.E.E (that’s a lot of letters 🙂). In the end, I used a 555 for the ‘slow’ clock (a crystal oscillator will be used for the ‘fast’ or normal clock) and didn’t need one for the single step circuit.
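
For reference, here is the textbook frequency formula for a 555 wired as an astable oscillator, which is the usual way to get a slow clock out of one. This is a sketch under assumptions: the component values are hypothetical examples chosen to land in the few-hertz range, not the values on my schematic.

```python
import math


def astable_frequency(r1, r2, c):
    """Frequency in Hz of a standard 555 astable circuit.

    r1: resistor from Vcc to the discharge pin, in ohms
    r2: resistor from the discharge pin to threshold/trigger, in ohms
    c:  timing capacitor, in farads
    """
    return 1.0 / (math.log(2) * (r1 + 2 * r2) * c)


# HYPOTHETICAL example values, not taken from the LEO-1 schematic
R1 = 10_000    # 10k
R2 = 68_000    # 68k
C = 2.2e-6     # 2.2 uF

freq = astable_frequency(R1, R2, C)
print(f"slow clock: {freq:.2f} Hz")
```

Swapping R2 for a potentiometer would make the slow clock adjustable, which is handy when watching LEDs.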

Prototyping single step


I smell a mistake

Using the PB-503 proto-board I did a bit of testing with the 74HCT parts that I had ordered. Things worked very well. I got a 4-bit counter going and was able to feed 4-bit data into a register and use the bus drivers to get the data out. Doing it in slow-motion (1 to 5 Hz clock) I could see the data was correct on a row of LEDs. Then I did something that I have never done with digital chips before; I connected an output to my oscilloscope, and my jaw hit the ground. This is what I saw:

Output of 74HCT chip at 100 kHz


Now I didn’t get into university to study electronics like I wanted to. I’m self-taught by experimentation and the Internet, so I didn’t get a very rounded education. I always thought digital signals were… well… digital. Like, square waves. That thing on my screen was not a proper square wave. It had spiky things in it. What troubled me was the max and min voltage readings. Almost seven volts from a 5V power supply? Where was that coming from? And what the hell was that -1.68V negative spike? I had used bypass capacitors like you’re meant to. I never really understood why you needed them but always did it anyway. Wasn’t that meant to prevent this kind of thing? Just for fun I increased the clock to 1 MHz and got this:

Output of 74HCT chip at 1 MHz


This made my blood run cold. How was this possible? If you are an experienced digital electronics engineer, you are laughing at me because you know why this was happening. Well, I didn’t, and I had to find out. Zooming in, I recognised the effect as a kind of ‘ringing’ which I am used to seeing coming out of my analogue synth – but my synth is meant to do that; it has filters in it designed to muck up your square wave so it sounds cool. I want my digital square waves to be perfect. Why was there analogue stuff in them? I Googled something like “TTL output ringing” and over the next few days I read a lot about things I had never even dreamed existed. ‘Ringing’, ‘ground bounce’, ‘noise’, ‘crosstalk’, ‘stray capacitance’ and a few other horrors that I forget. I also found out why you need bypass caps, which is nothing to do with this ringing problem. I realised I was in for a ton of random problems if I didn’t learn how to avoid those things. It seems you can’t just plug a load of digital stuff into each other without considering all kinds of weird analogue stuff that can happen to your signals. But I had done that. I had built digital stuff before, clocks and things. I had no scope to show me scary stuff and I always visualised the pulses as perfect square waves. What a poor fool I had been all these years. But wait, my circuits had worked, right? So my circuit would still work; just pretend I’d never looked at the waveform and move on. But I couldn’t. That would be like me pretending my variables were properly initialised in my code. No way, I couldn’t ignore this. I was going to have to find out and try to follow all the established rules for minimising this and any other horrors that were waiting for me. After I calmed down a bit it started to seem rather straightforward. Keep wires as short as possible, don’t run signal wires too close to each other, separate ribbon cable signals with interleaved ground signals, and so on.

But this test was on a single output. It wasn’t crosstalk, it was most likely an impedance mismatch from what I could understand. Apparently fixable with resistors. But you don’t see tons of resistors in digital circuits preventing this kind of thing so that couldn’t be it either.

After a while I found that this particular ringing issue is probably nothing much to worry about. I’ve read other people’s CPU blogs and no one seems to care or mention this stuff. I’m probably over-analysing it as I have a tendency to do. For one thing, the evil-looking negative spike is dealt with by internal diodes in the chip inputs. I’m still not sure how the over-supply spike is handled by the chip, if at all. Still, maybe I’ll just have to ignore it and follow best practices.

My reading led me to find out a lot more things that I wouldn’t have known. Without some understanding of these issues I might have ended up with a CPU that kind of worked a bit, sometimes, at slow clock speeds or something equally useless. At least now I have a fighting chance of making it work properly. But all the reading led me to a new issue, the issue of choosing HCT chips over HC. I read a paper by Texas Instruments entitled SN54/74HCT CMOS Logic Family Applications and Restrictions and found, at the end, the following: “…employing HCT instead of HC devices in pure CMOS systems cannot be recommended. […] Due to the lower noise margin, there is an increased risk of interference caused by crosstalk, especially when the lines on the printed circuit board exceed a certain length. Moreover, the reduced switching threshold no longer ensures faultless operation of advanced bus systems used in microprocessor applications today.” I started to think I had made a mistake choosing HCT parts, as I had suspected earlier. My decision was stupid. I should have just found out if I could get all the parts in HC and then built the whole thing with HC instead of wondering if I would be able to, or if I would need to fall back on LS parts. I’m kind of troubled by how I let this happen. It’s not like me to plan something so badly. I had become a bit excited and got carried away. Not the sign of a good engineer. I had to stop and think about this all some more. I went to bed and slept on it.
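
To put some numbers on that “lower noise margin” remark, here is a small sketch comparing HC and HCT inputs driven by a CMOS output. The voltage limits are typical datasheet figures at a 4.5V supply, quoted from memory, so treat them as assumptions and check the datasheet for your actual parts.

```python
# ASSUMED typical limits at Vcc = 4.5 V; verify against the real datasheets.
VOL_MAX = 0.1   # worst-case low output of a lightly loaded CMOS driver, volts
VOH_MIN = 4.4   # worst-case high output of a lightly loaded CMOS driver, volts

INPUT_THRESHOLDS = {
    "HC":  {"vil_max": 1.35, "vih_min": 3.15},
    "HCT": {"vil_max": 0.8,  "vih_min": 2.0},
}


def noise_margins(vil_max, vih_min, vol_max=VOL_MAX, voh_min=VOH_MIN):
    """Return (low_margin, high_margin) in volts.

    low_margin:  how much noise a valid low can pick up and still read as low
    high_margin: how much a valid high can droop and still read as high
    """
    return vil_max - vol_max, voh_min - vih_min


margins = {name: noise_margins(**t) for name, t in INPUT_THRESHOLDS.items()}
for name, (low, high) in margins.items():
    print(f"{name:3s} low margin {low:.2f} V, high margin {high:.2f} V")
```

With these figures the HCT input has much less headroom on the low side than the HC input, because its thresholds are shifted down to suit TTL levels. That is the trade-off the Texas Instruments paper is warning about.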