DMA Techniques for Personal Computer Data Acquisition Introduction Data acquisition generally involves sampling some set of "real world" signals at a regular rate, and storing the results for processing and display. Enhanced with data acquisition hardware, a personal computer is an excellent vehicle for this sort of activity. It can contain all the elements of the data acquisition system--system control, data storage, data manipulation, and report generation--at low cost. The ubiquitous IBM PC (and compatibles) is an excellent choice because of the richness of hardware and software enhancement products available for it. There are three basic techniques available for accomplishing the sampling task -- polling, interrupts, and DMA, or Direct Memory Access. While the main thrust of this discussion is DMA, the other two techniques deserve mention. Using the polling technique, the data acquisition system generates a clock pulse to signal the computer to sample the data. The clock can either be a stable, regular pulse from a crystal oscillator, or it can be generated by some external event. The computer's program is in a loop waiting for this clock. At each occurrence, it samples and stores the data. This technique has the advantage of being very simple to implement, and quite fast (up to 200,000 12-bit samples/sec in a '386 based machine using assembly language). The major disadvantage of polling is that it monopolizes the computer during data acquisition. Even the PC's keyboard and clock interrupts must be disabled to avoid missing data samples. Interrupt-driven data acquisition also requires some sort of clock signal to indicate when it is time to sample and store the data. In this case, however, the clock signal generates an interrupt to the PC, and the interrupt handling routine samples and stores the data. The computer is not in a program loop waiting for the clock. This overcomes the major disadvantage of polling -- the computer is free for other tasks as well as data acquisition. Each time the processor receives an interrupt, it has to save the contents of all its registers, so that it can pick up where it left off when it returns. This additional overhead makes interrupts much slower than polling (around 10000 12-bit samples/sec in a similar machine). An interrupt routine is also much more difficult to implement in software. Finally, in order to maintain a gap-free regular data acquisition rate, it may still be necessary to disable the PC's clock and keyboard interrupts. The third data acquisition technique is Direct Memory Access. Using this technique, the data acquisition hardware sends data to ( or receives data from) the computer's memory directly. DMA also uses a clock signal as do the other two techniques. In both polling and interrupt driven data acquisition, however, the data is retrieved from the hardware and transferred into memory by computer instructions in the user's program. DMA is a method by which the hardware takes control of the computer's data, address, and control busses, and interacts with memory directly without processor intervention (See Fig. 1). Each time the sample clock "ticks", data is directly inserted into memory. Since it proceeds without software intervention, DMA is also a fast method of acquiring data -- speeds of up to 180,000 12-bit samples per second can be achieved on a PC/XT. It also has the advantage of operating as a true background task. Since the user's program does not have to deal with the business of acquiring each sample, it is free to perform other tasks. The computer's keyboard and timer interrupts can remain active since they will not affect DMA operation. DMA in Data Acquisition Personal computers were originally designed to increase productivity in office automation applications. Among the first software products to appear were word processors, spreadsheets, and database managers. The fundamental difference between these sorts of applications and Data Acquisition is the extent to which they must react to the "real world." Office automation software responds to inputs from a human operator -- generally keystrokes. The potential frequency of these inputs is very low compared to the processing speed of the computer, even for the fastest typist in the world. This is not the case in Data Acquisition. Consider a system capturing an audio frequency input from an A/D converter. To achieve a bandwidth of 20 KHz, such a system would need to sample the input at 40,000 samples/sec, or once every 25 microseconds. If the input is sampled at a frequency lower than twice the signal frequency, "aliasing" errors will occur. In this case, higher frequency components of the input waveform "masquerade" as lower frequency components, causing potentially significant errors in signal re-construction. A good typist may enter 5 to 8 characters per second through the keyboard -- three orders of magnitude in frequency below what would be required of the data acquisition system. Data Acquisition, then, is very much a "real time" activity, and speed of response is a critical concern. DMA is a good way to provide this speed without seriously taxing the computer's ability to perform other tasks. It is not enough, however, to acquire a signal very quickly. In order to accurately reconstruct a waveform, one must know not only the value of a signal, but also the time at which the signal was sampled. The simplest way to know the time at which a sample was taken is to take all the samples at a regular rate. The time of any given sample can then be computed by knowing the exact time between samples. This method of sampling depends heavily on the sample rate being extremely regular. Variations in sample-to-sample timing, or "jitter", can lead to significant errors. Consider a 1 KHz sine wave at 20 V peak-to-peak being sampled by a 12-bit A/D converter with a timing uncertainty of only 1 usec. At worst case (around 0V where the rate of change of the sine wave is highest), the error generated by such an uncertainty would be: Error Voltage = 10*SIN(2*pi*f*1 usec) = 62.8 mv The 12-bit A/D will divide the 20 V full scale input range into 4096 equal parts or "counts". An error of 62.8 mv would correspond to .0628 V * (4096 counts / 20 V) = 12.9 counts! Such an error reduces the system accuracy to about what would be expected of an eight bit system. Clearly, this could easily be the most significant error in the system. The maximum timing jitter allowable that would give 1/2 count or less of error for a 12 bit system would be: T = ARCSIN(Error Voltage / Volts Full Scale) / 2*pi*f In this case T = ARCSIN(.00244 V/10) / 2*pi*1000 = 39 nsec ! The only way to achieve such stability in a computer is to have a clock derived from a crystal oscillator directly start A/D conversions. The DMA process would then be paced by the A/D's End of Convert Signal to ensure that only valid data are transferred. DMA for data acquisition, then, must be paced by a very regular clock. Besides being more demanding in terms of frequency than office automation, data acquisition must also be "event driven" to a much greater extent. That is, it must respond and synchronize itself to asynchronous events over which it may have little or no control. Often, it is not useful to take a random "snapshot" of 10,000 precisely timed samples of a signal. Often, one would really like to take a snapshot only of the 10,000 samples which contain some event of interest. In these cases, the optimal snapshot would contain data leading up to the event, and data following the event -- pre- and post-trigger data. The technique for achieving this is well known, and requires three basic elements -- a "circular" memory buffer, some sort of hardware trigger, and a delay counter. Data is acquired continuously into the circular buffer. When the buffer is full, it automatically wraps around and begins filling again, overwriting old data with new. When the hardware trigger occurs, the delay counter begins counting a pre-programmed number of additional samples. When the delay counter exhausts its count, data acquisition stops. The buffer is left with samples taken both before and after the trigger (Fig. 2). The oldest sample in the buffer is the one which would have been written next if data acquisition had continued. The newest sample is the one just written, and the trigger sample is located by going backwards through the buffer by the number of samples in the delay. This capability is partially inherent in the Personal Computer. Its DMA Controller has an "Autoinitialize Mode", described below, which can be used to create the required circular buffer. The hardware trigger and delay counter, however, must be built in to the data acquisition hardware. The Burr-Brown PCI-20000 Modular Data Acquisition System was the first PC-based data acquisition system to offer this capability. All major competitors in this area now offer some capablities like this. .cp 3 Survey of DMA Techniques The classical DMA technique which is supported by the IBM family of PCs is the "device to memory" technique. A single external device (for data acquisition, normally an A/D or D/A converter) communicates with the host computer's memory using the computer's DMA handshaking protocol. Fig. 3 shows a simple block diagram and memory map of this process. For purposes of data acquisition, this technique is illustrated in its simplest form in Fig. 4. An external timing source starts an A/D conversion directly. When the conversion is complete, the A/D's data is transmitted directly to memory. While this process is very fast, it has the disadvantage that only one channel of analog data is captured and transmitted. There are no provisions for multiple channels of analog data, or for any other type of data. The addition of a multiplexer and programmable counter, as shown in Fig. 5, allows the acquisition of multiple analog channels. Conversions are still started by the external timing source. Now, however, when a conversion starts, the counter advances the multiplexer to the next channel in the sequence. Advancing the counter at the start of conversion rather than at the end allows the multiplexer to settle on the next channel while the current channel is converting. The sample/hold amplifier stores the value of the current channel while the current conversion is in progress. The counter can be made such that it can scan the first N channels in sequence, or the last N channels. Each time a conversion starts, a different channel is converted and transmitted. Some technique similar to this is employed on virtually all modern DMA-compatible data acquisition boards. Another level of utility can be achieved by inserting a "list memory" between the counter and the multiplexer (Fig. 6). Instead of selecting an analog channel, the counter will select a memory location in the list memory. This memory location can contain the code for any channel, as well as the code for a gain, if a programmable gain amplifier is used in the system. As above, each time a conversion starts, the counter advances. In this case, however, it advances to the next memory location, and the contents of that memory location specify the channel for the next conversion. This technique offers several advantages. The list of channels to be scanned does not have to be sequential. Any random channel numbers can be programmed in the list. If some channels need to be scanned at higher frequencies than others, they can be repeated in the list. For example, suppose channels 7, 2, and 5 were to be scanned, and channel 2 contained higher frequency information than the other two. A channel list could be constructed to look like: 2,7,2,5. Channel 2 would be scanned at twice the rate of the others. If a programmable gain amplifier is used in the system, and its control bits are included in the list memory, then each channel can have a different gain. Normally, this is not possible under DMA control since gain has to be set with software. The random-channel scanner first appeared in PC-based data acquisition products in 1985, again in Burr-Brown's PCI-20000 system. Since that time, it has become available from most suppliers of such boards. A New Technique All of the techniques listed above are available in commercially available plug-in data acquisition boards. All of these schemes have one limitation in common, however: they can only monitor a single A/D converter under DMA control, and no other type of device at all. Almost all data acquisition boards have event counters and digital I/O on board as well as an A/D converter. This is because most applications involve more than simply monitoring analog channels. For example, if an application requires correlating analog inputs with a position signal from an absolute shaft encoder, then DMA cannot be used. The analog inputs can be monitored under DMA control, but the digital inputs from the shaft encoder cannot. Alternatively, if an application requires simultaneous sampling from several A/D converters, or to several D/A converters, then DMA could not be used. Only one of the converters could be under DMA control. A technique has been developed that gets around this limitation. Embodied in the Burr-Brown PCI-20041C Data Acquisition Carrier, it allows several (up to 64) data acquisition devices to share the same DMA channel in the PC. Any device in the system can be used -- A/D converters, digital I/O channels, and/or counters -- and they can be mixed together in the same DMA process. The scheme involves a technique similar to the random-channel scanner described above. RAM on the Carrier contains a list of all items to be transferred under DMA control. Each time a DMA transfer is requested (by a pacer clock, for example), the Carrier sends out one "frame" consisting of all the items in the list to a single DMA channel as fast as the computer will accept them. This frame can consist of any mixture of items available on the Carrier or Modules; A/D readings, digital input readings, and counter data can be intermixed as required (see Fig. 7). Up to five Carriers can be linked together in a master-slaves arrangement, allowing data from all five to be under DMA control simultaneously. The DMA list, or frame map, is contained in a block of 128 bytes of memory on the Carrier. Each item in the map consists of two bytes representing the local address of one item to be transferred under DMA control. There can be up to 64 such items in the list. The last item in the frame also contains an "End Of Frame" flag indicating the end of the list (Fig. 8). During a DMA transfer, the PC's address and control lines are removed from the Carrier's local bus, and replaced with the contents of one element from the list. This addresses one byte on the Carrier, causing its data to be placed on the data bus (or taken from the data bus, depending on the direction of transfer). When the transfer of the byte is complete, a counter on the Carrier advances to the next item in the list for the next transfer. After the last item is transferred, the counter is reset, pointing once again to the first list element. A DMA transfer can be requested by several sources in the system. Typical sources of transfer requests are the End of Convert signal from an A/D converter, a pulse from the on-board crystal controlled Pacer clock, an external TTL input pulse, or an interrupt from a Trigger circuit. There are also several triggering methods available to start and stop sequences of DMA transfers. For data acquisition input purposes, the method described earlier ( called Start on Command, Stop on Trigger with Delay) is most useful since it provides both pre-and post-trigger data. In this case, the trigger can either be an external TTL pulse, or it can be derived directly from the analog input signal using a PCI-20020M-1 Trigger/Alarm Module. For continuous DMA output of analog waveforms or digital patterns, the "Start on Command, Stop on Command" mode is most useful. It also involves the use of a circular buffer. An analog waveform can be constructed in memory, and then continuously output through a D/A converter module (or several modules if more than one waveform is desired) at the desired frequency using DMA. An alternative mode allows the DMA output to begin on a hardware trigger. During the DMA output, a user's program can acquire data from the same board using a polling or interrupt technique. If polled data acquisition is timed by the same clock used to generate the DMA output, then the two are synchronized. This can be used to build a stimulus-response system. Conclusion DMA is a powerful technique for data acquisition. It allows high speed data collection, and it makes background operation simple. When enhanced by specialized data acquisition hardware, it can transform a Personal Computer into a full-featured Data Acquisition System with impressive capabilities.