Issue # 1 DTACK GROUNDED Newsletter - July 1981
Copyright 1981 Digital Acoustics, Inc.
ABOUT OUR LOGO: (you should immediately skip to the 4th paragraph if you are not a hardware hacker). If you do not recognize the meaning of our logo, you obviously have not reviewed Motorola's application notes for their 68000. The 68000 features an asynchronous data bus, like many of the more complex minicomputers. Assuming that you are approaching the 68000 by way of an 8 bit microprocessor such as a 6502, 6800, or 8080 this will be strange to you because these all have synchronous data buses. The 68000, when writing data, places the data on the bus and waits for an acknowledgement that the data has been received. This is accomplished by placing a low logic level on pin 10, DTACK (DaTa ACKnowledged). A read is performed by requesting data from a device and waiting until a low logic level is placed on pin 10 which acknowledges that the data is now available on the bus.
Because the acknowledgements from all of the various devices connected to the bus must converge on pin 10, a very large percentage of the available application information on the 68000 published by Motorola covers the circuitry to generate DTACK. Motorola has thoughtfully provided another pin labelled BERR (Bus ERRor). When DTACK is not successfully generated, you are supposed to provide a watchdog timer which will assert BERR logic low. The 68000 will then trap to the location in memory pointed to by the BERR vector. At this location, you have to write software to decide what to do about the failure to generate DTACK.
Motorola is not worried about the complexity of this process because they are, for reasons known only to Motorola management, promoting the 68000 exclusively as the "engine" for very complex systems. Accordingly, there are a lot of companies who are planning to drive the PDP 11/70 out of the marketplace with $10,000 (base price) 68000 systems. It is rumored that Apple is one of these companies, and HP has already announced their system, which naturally comes with a 50 (fifty) column CRT.
If you are interested in paying $10,000 for an admittedly high performance, complex system based on the 68000, this would be a very good time to stop reading this newsletter. There will be absolutely nothing here to interest you. If you already own, or are thinking of buying, an Apple II or a Pet/CBM system, and you would like to inexpensively increase its computational throughput by a factor of 10 to 20, read on.
Again, about the logo: it is possible to build very simple systems using the 68000 if you ground DTACK. Naturally, you can then tie BERR to +5, since the data is always acknowledged. Now you have a SYNCHRONOUS data bus, and you can throw out about 93% of Motorola's application information on the 68000. Take another look at our logo: get it?
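For the software-minded, here is a toy model of the difference, written in Python rather than solder. It is only a sketch: the function name, the 16 clock watchdog limit and the idea of counting whole clocks are our own illustrative assumptions, not anything taken from Motorola's application notes.

```python
# Toy model of one 68000 bus cycle. The names and the 16-clock watchdog
# limit are illustrative assumptions, not Motorola's circuit.

def bus_cycle(clocks_until_dtack, watchdog_limit=16):
    """Return ("DTACK", clocks_waited) or ("BERR", clocks_waited).

    clocks_until_dtack: clocks the addressed device needs before it pulls
    DTACK low. None means no device ever answers.
    """
    for clock in range(watchdog_limit):
        if clocks_until_dtack is not None and clock >= clocks_until_dtack:
            return ("DTACK", clock)       # device acknowledged the transfer
    return ("BERR", watchdog_limit)       # watchdog expired: bus error trap


slow_device = bus_cycle(clocks_until_dtack=3)
missing_device = bus_cycle(clocks_until_dtack=None)
grounded = bus_cycle(clocks_until_dtack=0)    # DTACK tied to ground

print(slow_device, missing_device, grounded)
```

With DTACK grounded, the cycle never waits and the BERR branch is never reached, which is the whole point of the logo.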
Our prototype 68000 board has a minimum configuration of 17 ICs, many of which are actually used to interface the CBM 8032 which is the host processor for the board. However, most of the readers of this newsletter will be more interested in the use, rather than design of, a 68000 system. Those readers should be advised that they can attach a 68000 board to their Apple II or Pet/CBM system for about $600, or even a lot less if they can solder and test their own board.
Now is the time to reveal who "we" is, folks. DTACK GROUNDED is the fictitious name chosen by Digital Acoustics, Inc. as a trade name for our line of 68000-related software and hardware. Digital Acoustics has been in business since 1973 and shipped its first microprocessor based product early in 1974. We have in the past built instruments which used the Intel 4004 and 4040. We currently build unattended environmental noise monitors using the 6502 processor. DTACK GROUNDED grew out of a research project to determine the feasibility of using a 68000 based processor in the next version, or v.04 as we call it, of our DA607P environmental noise monitor.
Before you fall down laughing about the use of a 68000 in a glorified sound level meter, you should be advised that our DA607P v.03 carries a price tag of $7995 and that it has a built in thermal printer with HIRES graphics. It was released in July, 1980 and uses a 6502 processor with a real time clock that WORKS. By way of contrast, the Apple III was released sometime around Dec, 1980 with a real time clock that did NOT work (the Apple III is manufactured by the Apple Computer Company, which you may have heard of).
The latest model environmental noise monitor made by our major competitor (name on request) uses an Intel 4040 processor! For some reason, our sales have been much higher lately than those of our major competitor.
The problem, fellows, is that the 6502 is about to go the way of the 4004/4040. Very soon, like early next year, your neighbor's kid is going to come home with a Trash 80 IV based on an Intel 8086, and he is going to LAUGH at your 6502 based system the same way we at Digital Acoustics laugh at our competitor's 4040 based instrument. What really hurts is that he will be laughing with justification. The 8086 actually CAN run circles around a 6502 based system.
Now, a short eulogy: the 6502 has been very good to a lot of people, and for a lot of companies. Most of you are aware that it was by far the best performing microprocessor of its generation. Its generation has now passed (you know about this, of course, otherwise you would not be reading this newsletter). REQUIESCAT IN PACE.
We know deep down inside that our 6502 systems are beginning to sprout turkey feathers. The problem is not whether to stay with the 6502. The problem is to select the correct microprocessor to use next. Several companies and at least one 6502 based publication have decided that the 6809 is the logical successor to the 6502. Obviously, we disagree.
Let's talk about market windows. The Intel 4004/4040 was the most advanced microprocessor of its day (it was the ONLY microprocessor of its day!). The market window for the 4004 opened in 1972, and slammed shut when MOS Technology introduced the 6502 for $25, quantity one, in the summer of 1975. The problem with the 6809 is that it very nearly missed its market window. I have seen one announcement of the 6809 dated Feb. 1977. Had the 6809 reached the marketplace in 1978 it would have been enormously successful. Today, the market window is closing for the 6809. The window will slam shut on the day that the price of the 68000 drops below $XX. (The exact value of XX depends on familiarity and acceptance of the 68000 in the marketplace. We intend to assist in this regard.)
The market window of the 68000 is just opening. Because the architecture and microcode are both extensible, the 68000 will probably be a viable performer for a long time (nothing lasts forever). Imagine, if you will, microcoded string and floating point instructions, a 32 bit data bus, writeable microcode... All of these things are in the future of the 68000.
Why is the market window closing on the 6809? Because it is SLOW, that's why! If you want to add two 32 bit numbers, you have to do the usual LDA, ADC, STA four times in 6502 style (the 16 bit add in the 6809 is not an add with carry). And, since the 6809 is not pipelined, this will take 1/3 LONGER than the 6502 to do the job. Please understand that I am NOT claiming that the 6809 is not, overall, superior to the 6502. What I AM saying is that it is terribly slow considering that it came along 5 years later.
Now, consider the 68000: suppose you want to add those same two 32 bit numbers. Let the first 32 bit number be in D0 (data register 0) (you DID know the 68000 has 32 bit registers, didn't you?) and the second in D1. Let the result be returned to D0. Here is the code for this:
D0 81 ADD .L D1, D0
This is a one word (two byte) instruction which takes 0.75 microseconds (8MHz). The 6809 requires 24 bytes and 48 microseconds to do the same job. THAT'S 64 TIMES SLOWER! You want to add two 64 bit numbers?
D0 82 ADD .L D2, D0
D3 83 ADDX .L D3, D1
Two words, 1.75 microseconds. You still want to use the 6809? We have one word for you: Goodbye!
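For those who would rather check the carry handling than take our word for it, the following Python sketch models the two instruction, 64 bit add. The helper names (add_l, add64) are hypothetical, and the register assignment (low halves in D0 and D2, high halves in D1 and D3) is our reading of the listing, not something the listing states.

```python
MASK32 = 0xFFFFFFFF

def add_l(src, dst, x_in=0):
    """Model a 32-bit add: returns (result, carry_out)."""
    total = src + dst + x_in
    return total & MASK32, total >> 32

def add64(a, b):
    """Add two 64-bit values the way the two-instruction sequence does."""
    d0, d1 = a & MASK32, a >> 32       # first operand: low half, high half
    d2, d3 = b & MASK32, b >> 32       # second operand: low half, high half
    d0, x = add_l(d2, d0)              # ADD.L  D2,D0   (sets the extend bit)
    d1, x = add_l(d3, d1, x)           # ADDX.L D3,D1   (adds the extend bit)
    return (d1 << 32) | d0

# The speed ratio quoted for the 32-bit add: 48 microseconds vs 0.75.
speed_ratio = 48 / 0.75

print(hex(add64(0x00000001FFFFFFFF, 1)), speed_ratio)
```

The carry out of the low halves is the only thing connecting the two instructions, which is why the second one must be ADDX rather than a plain ADD.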
We hereby officially declare the formation of the "Simple 68000 Systems Chowder and Marching Society". This society has no dues, rules or bylaws. Its charter is obvious. Only those who are interested may join. We at DTACK GROUNDED (Digital Acoustics, Inc.) will do everything we can to reduce the admission fee.
For 6502/68000 systems to become a reality, we need hardware, software, and an information channel. None of these three prerequisites can exist independently. Perhaps the most important is the knowledge that it is possible to build simple, inexpensive systems using the 68000, which we herewith present. We now pause for a message from our sponsor (you KNEW there was a commercial coming, didn't you?).
DTACK GROUNDED is going to sell boards, assembled at various levels, which will attach the 68000 plus its own resident memory to the Apple II or Pet/CBM as a peripheral device. We call this an "attached processor". Because we are a small company, we can afford to sell these boards at a small profit. Because we are greedy, we want to sell lots of boards. To sell lots of boards, we have to GIVE AWAY lots of software, beginning with a monitor, cross assembler and a floating point package compatible with Apple II and Pet/CBM Basic. We also have to publish a newsletter (you are reading this, no?), priced below cost if the time required to write it is counted.
As I write this, there are about 300,000 systems out there using the 6502 and Microsoft Basic. Nearly all are Apples or Pets. Nearly all have 40 column CRTs. Perhaps half have disk(s) and a printer attached. Now, a LOT of these systems are used by dentists to mail monthly statements or by high school students to play games. None of these people will be interested in a 68000 attached processor on the first or second round.
First or second round? On the FIRST round, only the hard core hobbyist, probably a hardware hacking type, will be interested. This person follows the electronics industry journals, and is well aware of the advantages of the 68000. There are about 1000 of these guys, and 993 of them own Apples (this writer is one of those 1,000 and owns a CBM 8032/8050 system: 80X25 CRT, over 1 Megabyte on line.). These guys will buy to be the first on the block, and also because it is otherwise difficult to get a 68000 development system for about $600. The easiest way to distinguish a working engineer in industry from a hardware/software hacking hobbyist is by the clock. If it's after 5 PM, he's a hobbyist.
At the START of the first round, we have a monitor, floating point package, a logarithm routine with hooks to Basic that runs 14 times faster (at 8 MHz) and a VERY crude cross assembler, all available now and included FREE with our boards. At the END of the first round, we will have all of the standard transcendental functions (SIN, ATN, etc.) with hooks to Basic, and published hooks for the compiler/Pascal/C software types so they can come on board when ready (they will not come on board on the first round, because 1000 users is not a viable user base). A cross assembler which is merely crude will be available. All of this software will come free with the board, or priced to cover printing and mailing costs to registered board owners.
(This marketing practice is called "bundling", and has been ruled to be illegal IF the company is in a dominant position in its industry.)
By the end of the first round, we should begin to see some software surfacing from the user group. Some of this will be commercial, but early on it will probably be "hey, look what I made my 68000 do, and look how fast it runs" stuff. We would very much like to publish such software in the DTACK GROUNDED newsletter, with appropriate attribution. This would, of course, turn the newsletter into a journal. We understand that there is a very fine flight simulator program with 3 dimensional graphics that runs on the Apple. What if it were possible to increase the frame update rate by a factor of 10 to 20?
Incidentally, we love games. Games are as American as apple pie. Games are very useful to demonstrate computers to the great unwashed. But please submit 68000 code for tic-tac-toe games elsewhere. We think DTACK GROUNDED readers will be mostly interested in more serious stuff.
Now, the SECOND round: Some software hackers are going to come on board because they have seen one of the first round buyers' units run FAST. Some non-hackers (species "user vulgaris") will come on board because of a friend's recommendation and because the price of admission is declining (the 68000 price will be dropping as more second sources come on line and as product yield improves). Some small company industrial types are going to come on board because this is the cheapest way to get a 68000 development system, and because we intend to license our floating point packages at very reasonable one time fees. The cross-assembler will continue to improve, but is unlikely to result in invidious comparisons to Carl Moser's 6502 assembler.
A 61 (yes, sixty one!) bit floating point package, 14 decimal digits precision, will be made available complete with transcendentals and a calculator package which emulates the old Wang 600 calculator (this package exists NOW in 6502 code, but without most of the transcendentals). It will NOT be IEEE compatible for reasons to be given in a future issue of this newsletter.
An 80 bit floating point package will be written, with transcendentals, so that an error analysis of the 61 bit and the Microsoft Basic compatible packages can be accomplished (the integer portion of the divide and multiply routines of the 80 bit package already exist and have been used to check the existing 9 digit Microsoft compatible package).
Since the 80 bit package will have 18 decimal digit precision it will make a very effective, reliable 16 decimal digit package, which is what the business people really need. This package will be licensed at VERY reasonable rates for the business programming people. The four basic functions (add, subtract, multiply, divide) will run about five or six times faster than the 6502 is now running with its 9 decimal digit package, which is not really adequate for business use. The transcendentals will run about three times faster than the existing 9 digit 6502 package, and should prove very useful to those scientific types, such as astronomers, who need high precision arithmetic.
During the first part of the second round, we will provide software hooks to intercept the "find variable" subroutine which looks up the location of the floating point variables for transfer to one of the two floating point accumulators. The numeric variables will then be located in the 68000 memory space, not the 6502 memory. This will greatly speed up the Basic interpreter by eliminating the constant data transfer between the two processors and by allowing the 68000 to locate the variables, which it can do somewhat faster (ahem!) than the 6502.
Toward the end of the second round, we will provide hooks to transfer the "expression evaluation" subroutine from the 6502 to the 68000. The 6502 will now have little to do except perform string functions and monitor the keyboard (the 6502 is very good at monitoring keyboards). In other words, the 6502 portion of your system will be approaching the status of an I/O processor.
During the second round, some commercial software will begin to appear. I expect the first quality commercial software will take advantage of the 68000's extremely fast integer arithmetic to greatly speed up Apple II graphics.
Once the 80 bit floating point package is introduced, some general ledger business systems programmers will begin to use the 68000 board, either with the CBM 8032 or with an Apple with an 80 column CRT adapter board. In either case, a reasonable sized disk will be required. This means that the cost of the 68000 board will be (and IS NOW) a very small part of the total cost of the system. General ledger programmers who do NOT use the 68000 board will find themselves at a competitive disadvantage.
This takes us up to the beginning of the third round, which starts when someone, probably a British software house specializing in the 8032/8096 business computer, will introduce a compiled Basic which makes effective use of the power of the 68000. The next issue of this newsletter will explain why the marriage of compiled Basic and the 68000 is one that is made in heaven and why you should be miserable because you do not have this combination available to you RIGHT NOW.
Above is an optimistic scenario which can come true. An essential part is the continuation of this newsletter and your contributions to it, which will turn it into a journal as noted.
Another part is that we at DTACK GROUNDED have to sell enough boards to make it profitable to give away software, or find other related ways to make a profit. One way to do this is to offer the software separately at a low, but profitable in large quantity, price.
But, as we said previously, our main purpose is to make a little money on each board we sell.
How can we compete with 17 copycats who do not have our overhead because they are not busy writing software to support this program? Very simply. Our software will always be distributed with large, easily readable copyright notices. It happens that the copyright law in this country is clear and unambiguous. For this reason, copyright lawsuits tend to be open and shut affairs (the recently famous Data Cash lawsuit involved ROMs which were NOT marked with a copyright notice).
Does this mean that we will actually sue a computer club which copies and distributes our software to its members? Not unless done so for purely commercial reasons. The people we would be inclined to sue would be our competitors (people or companies which sell copycat boards, which is legal and aboveboard, but distribute our software with them, which is strictly illegal).
Actually, there is some stuff, particularly 68000 source code, which we don't even want the clubs to copy. That's why the last part of this newsletter is printed in black ink on dark red paper. Look, fellows, if we published our $10 floating point source code package in black ink on white paper there would be a 31st generation photocopy in Kabul, Afghanistan by Sunday afternoon!
HARDWARE STATUS REPORT:
The prototype 68000 board is operational, and has been since about June 1. A qualified consultant has been retained to design an interface board for the Apple II (Burtronix did the Microsoft Softcard board design, for instance). The design was turned over to us on July 9, and the printed circuit artwork for this interface board was completed, except for the DTACK GROUNDED logo, on 14 July. Today's date is 15 July, and this part of this newsletter will be revised in future days to keep you up to date.
The layout of the 68000 board, with sockets for 60k bytes 200 nsec static ram and a memory expansion connector will begin next week, July 20. After prototype testing (with assistance from our Apple consultant), an initial order of 250 circuit boards, with solder mask and component layout silk screen, will be placed. The 68000 board artwork will then be modified to the CBM 8032 interface configuration, which includes a private 2K ram for the 8032 mapped into the second half of the CRT memory space. A 64K memory expansion board will be available sometime in October.
The 8032 version mounts INSIDE the 8032 and is therefore shielded. The Apple II version obviously has the main board mounted OUTSIDE. However, we understand that the first 200,000 Apple IIs sold can't be used anywhere near a TV set anyway. Nevertheless, how to shield the 68000 board when used with the Apple will have to be looked into.
SOFTWARE STATUS REPORT:
If you send us $10 plus 50 cents postage (U.S. only) NOW, you will get twenty-plus pages of 68000 floating point (the Microsoft Basic compatible 9 decimal digit version) source code by return (1st class) mail (Calif. residents add 6%). At this price, the package is obviously offered with no support and no license for its reproduction in any form is offered or implied. However, the package has been carefully tested and there are no known bugs. Should a bug be located after the fact, a correction will be printed in a future issue of DTACK GROUNDED.
Not incidentally, purchasers of the floating point package will be interested to know that the innards of this package will be examined in detail in this and future issues in the second, secret (i.e., difficult to photocopy) section of DTACK GROUNDED as the tutorial on 68000 software. This benefits the software buyer by explaining how the 68000 code works, in case you are not familiar with 68000 code. The DTACK GROUNDED subscriber benefits even when not interested in the FP package, since the tutorials are thus guaranteed to cover tested, functional, useful code. Or haven't you ever seen "code" published in magazines and textbooks which had obviously never been run, and obviously never could run?
If you would like to subscribe to this newsletter, send $15 for six issues to:
1415 E. McFADDEN St. F
SANTA ANA CA 92705
Apple, Apple II, Apple III are trademarks of the Apple Computer Company. Pet and CBM are trademarks of Commodore Business Machines, Inc. DTACK GROUNDED is a trademark of Digital Acoustics, Inc. As far as we know, Trash 80 IV is not, either in part or in whole, anybody's trademark.
If you received this newsletter as a free sample, this is the end (if you received it in reply to our magazine S.A.S.E. information offers, this is printed half size to reduce postage costs). For subscribers, the next six RED pages are a tutorial on using the 16 bit unsigned multiply to generate a 40X32 bit multiply with 40 bit result, which is what is needed to be compatible with the Microsoft floating point multiply (but 31 times faster).
The next issue of DTACK GROUNDED will have complete pricing and ordering information on the hardware.
If you are still with us this far, and plan to stay with us, it is absolutely essential that you get a Motorola 68000 specification sheet and a User's Manual, Motorola part # MC68000UM(AD2).
WELCOME TO REDLANDS, CASH CUSTOMERS!
in re FLOATING POINT:
Some definitions of terms and other clarifications are in order. First: if you do not already know the difference between floating point and integer representation of numeric values, then it is highly unlikely that you will catch on to most of what is happening in the 68000 code. What you should do is get a good book on the subject, and come back to this material when you DO understand the difference.
I can't think of a book which is clear, concise, inexpensive and readily available on this subject. BYTE publishes a book which is a collection of articles previously printed in BYTE magazine on the representation of numbers in computers, titled "NUMBERS IN THEORY AND PRACTICE". This book is presumably at least available. If any reader can recommend a better book on this subject, please drop a line to the editor of DTACK GROUNDED.
A short review: the four fundamental floating point operations are add, subtract, multiply, divide. These are dyadic operations, meaning we need TWO operands to obtain a result. Add and multiply are commutative, meaning A+B = B+A and A*B = B*A. However, subtract and divide are not commutative; we have to make a decision who does what to whom. In a Microsoft compatible package, the decision is made for us. Let FPACC#1 be the floating point accumulator, where the result of an F.P. operation is left. FPACC#1 has a guard byte, remember. We subtract FPACC#1 from FPACC#2 (by changing the sign of FPACC#1, then proceeding to perform a signed addition). When dividing, Microsoft first rounds FPACC#1, eliminating the guard byte. They then divide FPACC#1 into FPACC#2.
A floating point accumulator has a sign (positive or negative), an exponent and a mantissa. FPACC#1 also has a guard byte (in Microsoft Basic). We call the sign bit in FPACC#1 S1, the exponent X1, the mantissa M1 and the guard byte G1. The corresponding descriptors for FPACC#2 are S2, X2, M2 and G2. G2 is (initially) by definition zero, but is needed to perform addition or subtraction with G1.
If you are not familiar with assembly language, you will have a lot of trouble (at first) with the material to follow, but do NOT rush out and buy a book, at least for now. Assembly language programming, like riding a bicycle or participating in a certain popular indoor sport, is best learned by doing. After you are a competent assembly language programmer, a book on the subject may provide interesting historical information as well as the correct names and descriptors for what you have already been doing.
When discussing the 6502, the data lines are named D7 thru D0 and the address lines are named A15 thru A0. Motorola has chosen to name the data registers in the 68000 D7 thru D0, and the address registers A7 thru A0. Therefore, we will use b15 thru b0 to describe the data lines, and b31 thru b0 to describe the individual bits in a register. When it becomes necessary to discuss address lines, we will for the time being use the name "address line(s)" to describe the address line(s). This is not very original and certainly not very compact, so we are open to suggestions on this matter.
We assume that you have previous experience with assembly language on an eight bit microprocessor, probably the 6502. Be advised that the single most important assembly mnemonic for the 68000 is MOVE. This single mnemonic replaces LDn, STn and Tnn. In other words, MOVE is used for loads, stores and transfers.
Whether one or three descriptors are used for these three operations is arbitrary, but don't get the idea that the subject is not considered important. There was an intense battle at the very top echelon of Intel over MOVE vs LD, ST, TFR a few years ago. The loser founded ZILOG. Have you ever compared 8080 and Z80 source code?
MOTOROLA 68000 ARCHITECTURE:
Now for a VERY short introduction to the 68000 architecture: it has eight general purpose data registers, each of which has 32 bits. It has eight 32 bit address registers, one of which (A7) is dedicated as the stack pointer. That's 64 bytes of on board registers. This compares to 4 bytes for the 6502: the accumulator, X, Y and stack pointer. We have recently been writing code using this wealth of on-board storage, and let us tell you it's FUN!
NOW LET'S GET ON WITH IT
Take a peek at pages 13 and 14. That's the actual code used to perform the integer multiply in our Microsoft compatible nine digit floating point package. M1 is a four byte memory location containing the mantissa of FPACC#1. G1 is the next byte in memory, and is the guard byte used by Microsoft in their floating point package to improve the accuracy. We reserve the next byte as a "garbage" byte since the 68000 can't perform byte multiplies. M2 is the location of the four byte mantissa of FPACC#2. M2 has no guard byte.
Following is a description of what is happening in the Motorola 68000 source code on pages 13 and 14: first we MOVE the guard byte and the garbage byte as a word to the lower 16 bits of data register 0, or D0. The ".W" following MOVE specifies a word (16 bit) operation. Then we clear the lower byte of D0 by using the CLR (clear) instruction, with the ".B" specifying an 8 bit operation. The guard byte is left in b15 thru b8 of D0.
Now we MOVE D0 to D3. Note that the size of the move is unspecified. The assembler defaults to ".W" in the absence of a size specifier, and a 16 bit MOVE is exactly what we wanted in order to have a copy of the guard byte in D3. However, relying on the default is very poor style, especially for a beginner (which we are!).
The next two instructions MOVE the lower order (but higher in memory) 16 bits of M1 into D1, with a copy in D4. The next two instructions MOVE the high order 16 bits of M1 into D2, with copy in D5.
We now have two copies of the 40 bit M1+G1 of FPACC#1 inside the 68000, the first copy in D2, D1, D0 and the second copy in D5, D4, D3. We have no idea what is in the upper 16 bits of these six data registers, but that's O.K. because the 16 bit multiply instruction ignores the upper half of the registers.
Now we move the lower order half of M2 into D7 (get ready, everybody, on the next instruction we are, finally, going to perform an actual HARDWARE MULTIPLY!). Now we execute "MULU D7, D0", which requires 8.75 microseconds at 8 MHz. This instruction replaces the contents of D0 with the 32 bit result of the 16 by 16 bit multiplication. Aren't 32 bit registers handy? D7 is unchanged, so we continue, multiplying times D1 and then D2.
Now that we have all the partial products involving the lower order portion of M2, we MOVE the high order half of M2 to D7 and multiply it times D3, D4 and D5. We now have all of the partial products (six of them) inside the 68000. At the bottom of page three is a diagram of the contents of D0 thru D5, followed by a symbolic representation of the desired result.
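As a cross-check on the six partial products, here is a short Python sketch. The function names are hypothetical. The bit weights given to each register (D0 at bit 0, D1 and D3 at bit 16, D2 and D4 at bit 32, D5 at bit 48) are our inference from where the text says the carries go, and Python's unlimited integers stand in for MULU.

```python
MASK16 = 0xFFFF

def partial_products(m1, g1, m2):
    """m1, m2: 32-bit mantissas. g1: 8-bit guard byte of FPACC#1."""
    d0 = d3 = g1 << 8            # guard byte in b15..b8, garbage byte cleared
    d1 = d4 = m1 & MASK16        # low order half of M1
    d2 = d5 = m1 >> 16           # high order half of M1
    d7 = m2 & MASK16             # low order half of M2
    d0, d1, d2 = d0 * d7, d1 * d7, d2 * d7     # three MULU D7,Dn
    d7 = m2 >> 16                # high order half of M2
    d3, d4, d5 = d3 * d7, d4 * d7, d5 * d7     # three more MULU D7,Dn
    return d0, d1, d2, d3, d4, d5

def weighted_sum(d0, d1, d2, d3, d4, d5):
    """Line the six 32-bit products up at their assumed bit positions."""
    return d0 + ((d1 + d3) << 16) + ((d2 + d4) << 32) + (d5 << 48)

m1, g1, m2 = 0xC90FDAA2, 0x21, 0xB17217F7     # arbitrary normalized test values
full = ((m1 << 16) | (g1 << 8)) * m2          # the 48 by 32 bit product, done directly
print(weighted_sum(*partial_products(m1, g1, m2)) == full)
```

If the weights are right, the weighted sum of the six registers equals the product of the 48 bit value M1, G1, garbage byte and the 32 bit M2.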
After adding the partial products, that portion of the result below GG will be discarded, since we want only the most significant 40 bits of the result. However, we must first determine whether the addition of D0 thru D4 will generate carries into the portion of the result we want to keep. Therefore, we can only throw away the lower order bits when it is impossible for them to be added to another low order result, possibly generating a carry.
Looking at D0, we see that we can immediately discard the lower half of the register. We also have the problem that we want to add the UPPER half of D0 to the LOWER half of D1. Continuing with the code on page 4, we CLR (clear) the lower 16 bits of D0 by specifying ".W" for a 16 bit operation. Then we SWAP the upper and lower 16 bits of D0. The upper 16 bits of D0 are now all zero, and the former high order bits of D0 are now in the low order position... which is exactly where we need them for an add into D1.
Now we perform a 32 bit ADD, adding D0 to D1 with the result left in D1 (isn't it nice to have 32 bit instructions?). Unlike the 6502, it is not necessary to clear the carry first. However, the carry bit will, if appropriate, be set as a result of this add. The next instruction (BCC MUL1) tests the CY and skips if the carry is zero. If the carry is set, we do not skip and instead add one to D5 (a quick look at the diagram on the bottom of page 3 will show that a carry propagating out of D1 should proceed to D5 in bit position 0). ADDQ means "add quick", which can be done using data from #1 to #8 in a one word instruction, which is why it is called "quick".
D1 and D3 are aligned already, so we add them, leaving the result in D3. Once again, we test for a carry and add one to D5 if CY = 1. Now the lower half of D3 is useless, and we have to align the UPPER half of D3 with the LOWER half of D2. So: CLR .W D3, then SWAP D3. We can now add D3 to D2, and so we do! But look at the diagram again: any carry generated must be added into bit position 16 of D5. Now we can't use ADDQ to correct D5, so we add the 32 bit immediate data $00010000 to D5 if a carry is generated adding D3 to D2, and again on adding D2 to D4.
Let us carefully consider the contents of D4 at the location labelled "MUL4": The upper half of D4 must be added to the lower half of D5. But before we clear the lower half of D4, remember that it now contains our guard byte in bit locations 15 thru 8. So, first we store our guard byte in D0. Now we do the familiar CLR .W, SWAP, ADD .L. However, we do not have to test for a carry because no carry is possible on this add.
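The whole add-and-carry sequence described above can be modelled in Python as follows. This is a sketch under assumptions, not a transcription of the listing: the helper names are ours, the register weights are inferred from the carry destinations given in the text, and the result is compared against Python's own big integer multiply.

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF

def add_l(src, dst):
    """32-bit ADD.L: returns (result, carry)."""
    total = src + dst
    return total & MASK32, total >> 32

def high_to_low(reg):
    """CLR.W then SWAP: discard the low half, move the high half down."""
    return (reg >> 16) & MASK16

def multiply_40x32(m1, g1, m2):
    """Return (D5, D0): top 32 bits and next 16 bits of the product."""
    g = g1 << 8
    m1lo, m1hi = m1 & MASK16, m1 >> 16
    m2lo, m2hi = m2 & MASK16, m2 >> 16
    d0, d1, d2 = g * m2lo, m1lo * m2lo, m1hi * m2lo   # MULU by low half of M2
    d3, d4, d5 = g * m2hi, m1lo * m2hi, m1hi * m2hi   # MULU by high half of M2

    d0 = high_to_low(d0)
    d1, cy = add_l(d0, d1)      # ADD.L D0,D1
    d5 += cy                    # ADDQ.L #1,D5 when the carry is set
    d3, cy = add_l(d1, d3)      # ADD.L D1,D3
    d5 += cy
    d3 = high_to_low(d3)
    d2, cy = add_l(d3, d2)      # ADD.L D3,D2
    d5 += cy << 16              # ADD.L #$00010000,D5 when the carry is set
    d4, cy = add_l(d2, d4)      # ADD.L D2,D4
    d5 += cy << 16
    d0 = d4 & MASK16            # save the half holding the guard byte
    d4 = high_to_low(d4)
    d5, cy = add_l(d4, d5)      # final add: the text says no carry is possible
    assert cy == 0
    return d5, d0

m1, g1, m2 = 0xC90FDAA2, 0x21, 0xB17217F7     # arbitrary normalized test values
d5, d0 = multiply_40x32(m1, g1, m2)
reference = (((m1 << 16) | (g1 << 8)) * m2) >> 32
print(((d5 << 16) | d0) == reference)
```

The guard byte of the result is the upper byte of the value returned as D0.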
We now have the most significant 32 bits of the result in D5 and the next lower 16 bits in the lower half of D0. In a floating point multiply (Microsoft compatible version) the most significant bits of M1 and of M2 are both ones. The highest possible value of M1, G1 is $FFFFF and the lowest is $80000. The highest and lowest values of M2 are $FFFF and $8000. The highest possible result is then $FFFEF0001 and the lowest is $400000000 (because we actually perform a 48 bit, not 40 bit, multiply by 32 bits, another hex zero is appended to those results).
The lowest possible result does not have bit 31 = 1, so it is not in normalized mantissa form; whenever that happens the mantissa must be normalized. However, normalization is not part of the integer multiply, which is complete as of the next to last instruction on page 14.
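Normalization itself is not shown in this issue, so the following Python fragment is only our guess at the shape of that step: shift D5 and the guard word left together until bit 31 of D5 is a one, and report how many shifts were needed so the exponent can be adjusted elsewhere. The function name normalize and its return convention are hypothetical.

```python
def normalize(d5, guard_word):
    """Shift the 48 bit value D5:guard_word left until bit 31 of D5 is set.

    Returns (d5, guard_word, shifts). When both input mantissas were
    normalized, at most one shift is ever needed.
    """
    value = ((d5 & 0xFFFFFFFF) << 16) | (guard_word & 0xFFFF)
    shifts = 0
    while value and not value & (1 << 47):
        value = (value << 1) & 0xFFFFFFFFFFFF
        shifts += 1
    return value >> 16, value & 0xFFFF, shifts

# Lowest possible product: one shift puts a one in bit 31.
assert normalize(0x40000000, 0x0000) == (0x80000000, 0x0000, 1)

# Highest possible product: already normalized, nothing to do.
assert normalize(0xFFFFFFFE, 0xFF00) == (0xFFFFFFFE, 0xFF00, 0)
```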
We hope this short introduction to Motorola 68000 programming has been informative, and has whetted your desire to get your hot, sweaty hands on that beautiful 64 pin ceramic device! Although it took a while to describe how the integer multiply worked, you may be interested to know that the 68000 executes that code in about 76.5 microseconds (8 MHz), which is about 31 times faster than the 6502!
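Taking those figures at face value, a little arithmetic shows what they imply. The variable names below are ours, and the 6502 time is only what the "31 times" ratio implies, not a measurement.

```python
CLOCK_MHZ = 8.0        # 68000 clock, from the text
MULTIPLY_US = 76.5     # quoted execution time in microseconds
SPEEDUP = 31           # quoted ratio against the 6502

cycles = MULTIPLY_US * CLOCK_MHZ          # clock cycles spent in the routine
implied_6502_us = MULTIPLY_US * SPEEDUP   # implied 6502 time in microseconds

assert cycles == 612.0
assert implied_6502_us == 2371.5
```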
MISCELLANEAE in re ADDRESSING:
About address line 0: the 68000 doesn't have one. Since the data bus is 16 bits wide, running address line zero, which differentiates between bytes, to word locations doesn't make sense. The 68000 instead has two data strobes, one for b15 thru b8 called UDS and one for b7 thru b0 called LDS. These stand for Upper Data Strobe and Lower Data Strobe, respectively. If a data byte is being written from a data register, say D0, to memory at an odd location, the lowest 8 bits of D0 are placed on b7 thru b0 of the data bus and LDS is asserted. When writing to an even memory location, the same lowest 8 bits of D0 are placed on b15 thru b8 and UDS is asserted. When performing a word or long word operation, the address can definitely and positively NOT be odd. More on this next issue.
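The byte lane rule can be written down as a small Python model. This is a sketch of the behavior described above, not of the chip's pin timing; the function names are hypothetical, and only the half of the data bus qualified by the asserted strobe is shown.

```python
def byte_write_cycle(address, byte):
    """Return (word_address, strobe, data_bus) for a byte write.

    Even addresses use the upper lane (b15 thru b8, UDS);
    odd addresses use the lower lane (b7 thru b0, LDS).
    """
    byte &= 0xFF
    word_address = address & ~1      # there is no address line 0
    if address & 1:
        return word_address, "LDS", byte
    return word_address, "UDS", byte << 8

def word_access_strobes(address):
    """A word access asserts both strobes and must use an even address."""
    if address & 1:
        raise ValueError("word and long word accesses must not be odd")
    return ("UDS", "LDS")

assert byte_write_cycle(0x1000, 0xAB) == (0x1000, "UDS", 0xAB00)
assert byte_write_cycle(0x1001, 0xAB) == (0x1000, "LDS", 0x00AB)
assert word_access_strobes(0x1000) == ("UDS", "LDS")
```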
Although the address registers are actually 32 bit registers, the 8 highest order bits are not brought out on the 68000 package. As a result, the 68000 is restricted to a mere 16 megabyte addressing range, all of which can be accessed without bank switching, segmentation or other fun techniques. Following is the code needed to transfer one million bytes (decimal) from one location in 68000 memory to another:
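        MOVE.L  #1000000,D0     BYTE COUNT
        MOVE.L  #LOC1,A0        SOURCE ADDRESS
        MOVE.L  #LOC2,A1        DESTINATION ADDRESS
XFR     MOVE.B  (A0)+,(A1)+     COPY ONE BYTE, BUMP BOTH POINTERS
        SUBQ.L  #1,D0
        BNE     XFR             LOOP UNTIL THE COUNT REACHES ZERO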
Try THAT on your friendly 8086, Z8000 or bank switched 6809!
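One thing worth checking in any loop of this shape is how the count in D0 relates to the size suffix on the MOVE, since each pass moves 1, 2 or 4 bytes. Here is a rough Python model of the loop; bytes_moved is a hypothetical helper of ours and it counts bytes only, it does not model memory.

```python
SIZE_BYTES = {"B": 1, "W": 2, "L": 4}

def bytes_moved(count, size):
    """Bytes copied by: XFR MOVE.<size> (A0)+,(A1)+ / SUBQ.L #1,D0 / BNE XFR."""
    if count < 1:
        raise ValueError("this model assumes a count of at least 1")
    moved = 0
    d0 = count
    while True:
        moved += SIZE_BYTES[size]    # MOVE.<size> (A0)+,(A1)+
        d0 -= 1                      # SUBQ.L #1,D0
        if d0 == 0:                  # BNE XFR falls through at zero
            break
    return moved

assert bytes_moved(1_000_000, "B") == 1_000_000   # one byte per pass
assert bytes_moved(250_000, "L") == 1_000_000     # four bytes per pass
assert bytes_moved(1_000_000, "L") == 4_000_000   # four million, not one

# 24 address lines reach 16 megabytes, so even the last case fits.
assert 1 << 24 == 16 * 1024 * 1024
```

Moving long words cuts the number of passes by four, at the price of requiring even source and destination addresses.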
And now, as the sun sinks slowly in the west, we bid you a fond adieu until the next issue of DTACK GROUNDED.
                            OPT     P=68000,BRS,FRS
00414A                      ORG     $00414A

                      ********************************************
                      *THE FOLLOWING CODE MULTIPLIES 32 * 40 BITS*
                      ********************************************

00414A 3038 702E      MUL   MOVE.W  G1.W,D0
00414E 4200                 CLR.B   D0
004150 3600                 MOVE    D0,D3
004152 3238 702C            MOVE.W  2+M1.W,D1
004156 3801                 MOVE.W  D1,D4
004158 3438 702A            MOVE.W  M1.W,D2
00415C 3A02                 MOVE.W  D2,D5
00415E 3E38 7034            MOVE.W  2+M2.W,D7
004162 C0C7                 MULU.W  D7,D0
004164 C2C7                 MULU.W  D7,D1
004166 C4C7                 MULU.W  D7,D2
004168 3E38 7032            MOVE.W  M2.W,D7
00416C C6C7                 MULU.W  D7,D3
00416E C8C7                 MULU.W  D7,D4
004170 CAC7                 MULU.W  D7,D5

                      ********************************************
                      *       MULTIPLICATION IS COMPLETE         *
                      * NOW ALIGN AND ADD THE PARTIAL PRODUCTS   *
                      ********************************************

                      *D0 =                     HHHH HHHH
                      *D1 =                HHHH HHHH
                      *D2 =           HHHH HHHH
                      *D3 =                HHHH HHHH
                      *D4 =           HHHH HHHH
                      *D5 =      HHHH HHHH
                      *PRODUCT = MMMM MMMM GG
                      *WHERE M= 32 BIT MANTISSA, G= 8 BIT GUARD

004172 4240                 CLR.W   D0
004174 4840                 SWAP.W  D0
004176 D280                 ADD.L   D0,D1
004178 64 02                BCC     MUL1
00417A 5285                 ADDQ.L  #1,D5
00417C D681           MUL1  ADD.L   D1,D3
00417E 64 02                BCC     MUL2
004180 5285                 ADDQ.L  #1,D5
004182 4243           MUL2  CLR.W   D3
004184 4843                 SWAP.W  D3
004186 D483                 ADD.L   D3,D2
004188 64 06                BCC     MUL3
00418A DABC 00010000        ADD.L   #$00010000,D5
004190 D882           MUL3  ADD.L   D2,D4
004192 64 06                BCC     MUL4
004194 DABC 00010000        ADD.L   #$00010000,D5

                      ********************************************
                      *STORE THE GUARD BYTE IN D0,BITS B15 THRU B8
                      ********************************************

00419A 3004           MUL4  MOVE.W  D4,D0
00419C 4244                 CLR.W   D4
00419E 4844                 SWAP.W  D4

                      ********************************************
                      *  NO CARRY IS POSSIBLE ON AN ADD TO D5    *
                      ********************************************

0041A0 DA84                 ADD.L   D4,D5
0041A2 6B 08                BMI     MULX            SKIP IF D31 = 1

# 000041AC            MULX  EQU     $0041AC
# 0000702E            G1    EQU     $00702E
# 0000702A            M1    EQU     $00702A
# 00007032            M2    EQU     $007032