Application design of real-time image processing system based on OMAP5910 dual-core processor

The defining feature of a real-time image processing system is its large data volume, so efficient processing and transmission of image data is the key to achieving real-time performance. TI's OMAP5910 is a high-performance multimedia dual-core processor that integrates a high-performance, low-power TMS320C55x DSP core and an ARM925 microprocessor with strong control capability on a single chip. This paper focuses on how to exploit the advantages of the two cores and how to use the various memory configurations of the OMAP5910 when configuring the DMA controller, so that large volumes of image data can be transferred efficiently in real time.

Authors: Wang Yanyan, Zhi Chen, Yu Rui

1 Memory management of OMAP5910

Because the OMAP5910 supports several types of memory, designing a DMA transfer scheme requires a detailed understanding of its memory management.

The MPU subsystem of the OMAP5910 integrates 192KB of SRAM, and the DSP subsystem integrates 64KB of dual-access RAM (DARAM), 96KB of single-access RAM (SARAM), and 32KB of program ROM (PDROM). The memory mapping of the MPU and DSP subsystems is shown in Figure 1. Through the EMIFF and EMIFS interfaces, the OMAP5910 can access off-chip memory, but access to off-chip memory is much slower than access to on-chip memory.

The memory of the OMAP5910 is managed mainly by the traffic controller (TC). The TC manages accesses by the MPU, DSP, DMA and local bus to the OMAP5910 system storage resources (SRAM, SDRAM, Flash, ROM, etc.). Its main function is to ensure that the processors can access external storage efficiently and to avoid bottlenecks that would reduce on-chip processing speed. The TC supports memory access by a processor or DMA unit through three different interfaces: EMIFS, EMIFF and IMIF. The EMIFS interface provides access to Flash, SRAM or ROM, the EMIFF interface provides access to SDRAM, and the IMIF interface provides access to the 192KB on-chip SRAM of the OMAP5910. The three interfaces are completely independent, can be accessed from any processor or DMA unit, and can be accessed simultaneously.

2 DMA controller of OMAP5910

The DMA controller of OMAP5910 is very important to the real-time image processing system. It can complete the data movement in the mapped storage space without CPU participation. The flexible use of the DMA controller can greatly improve the data transmission efficiency.

The characteristics of the DMA controller of OMAP5910 for general function transmission are as follows:

1) Single-channel split operation, with general-purpose and dedicated channels and separate hardware resource ports. All data exchanges are handshaked through the request, ready and abort signals. The DMA channels are time-division multiplexed, and the basic transfer flow is shown in Figure 2.

2) Multi-frame transmission. Each transferred block can contain multiple data frames. The supported data sizes are 8-bit, 16-bit and 32-bit. The controller packs and unpacks data and counts the bytes transferred. All memory address spaces (physical address map and I/O space) are accessible.

3) The read, write and frame operations of DMA are all interrupt-triggered. Each DMA physical channel can generate an interrupt to make the processor respond to the status of this transfer. All DMA interrupts are level interrupts.

4) Background transfer with high throughput. The DMA works independently of the CPU and can move data at the CPU clock speed.
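As a rough illustration of item 2, the amount of data moved by one multi-frame block transfer follows from the element size, the number of elements per frame and the number of frames per block. The sketch below is a plain calculation, not OMAP5910 register code; the function name `block_bytes`, its parameter names, and the choice of storing one pixel per 16-bit element in the example are assumptions made for illustration.

```python
VALID_ELEMENT_BITS = (8, 16, 32)  # element sizes listed in item 2


def block_bytes(element_bits: int, elements_per_frame: int, frames_per_block: int) -> int:
    """Return the number of bytes moved by one block transfer (hypothetical helper)."""
    if element_bits not in VALID_ELEMENT_BITS:
        raise ValueError(f"unsupported element size: {element_bits} bits")
    if elements_per_frame <= 0 or frames_per_block <= 0:
        raise ValueError("element and frame counts must be positive")
    return (element_bits // 8) * elements_per_frame * frames_per_block


# Assumed example: 8 image rows of 320 pixels, one pixel per 16-bit element,
# with each image row treated as one DMA frame.
example_bytes = block_bytes(element_bits=16, elements_per_frame=320, frames_per_block=8)
print(example_bytes)
```

Treating one image row as one DMA frame is only one possible mapping; it keeps the frame count equal to the number of rows in a block.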

The image data handled by a real-time image processing system is very large, and a large amount of intermediate data is also generated during processing. The on-chip resources of the OMAP5910 are limited and cannot hold a full frame of image data together with its intermediate data, so most of the image data must be stored in off-chip memory. To ensure real-time performance, the DMA is responsible for moving data between the different storage spaces without consuming CPU clock cycles, which prevents the CPU from spending most of its time blocked on external memory accesses. In addition, the data rearrangement function of the DMA can optimize how image data is laid out in memory, which improves both the utilization of internal storage space and the data transfer rate.

3 OMAP5910 internal and external memory data exchange analysis

A complete real-time image processing system must both acquire images in real time and process them in real time. The system mainly consists of an image sensor, an A/D converter, a programmable logic device (FPGA), the OMAP5910 dual-core processor, an image display device, etc. It works as follows. The FPGA receives the 14-bit video signal output by the infrared focal plane array sensor in real time. After downscaling, the DSP core of the OMAP5910 executes the image processing algorithm while the ARM core of the OMAP5910 executes the complex control instructions. The FPGA then buffers the result and synthesizes a 10-bit video signal, which is output through D/A conversion. In addition, the ARM core of the OMAP5910 receives control instructions from the computer through an interface.

To satisfy the human eye, the imaging system must acquire and process at least 25 frames of image data per second so that the real-time display does not flicker. For a 320×240 image with 14-bit A/D conversion, the data collected per frame is 320×240×14 bits ≈ 1 Mbit. To meet the real-time requirement, data must be processed and displayed at 320×240×14 bits×25 frames/s ≈ 25 Mbit/s ≈ 3.125MB/s. In other words, reading 1 line (320 pixels) and writing 1 line (346 pixels) must be completed within 64 μs; only then will the image remain continuous.
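The arithmetic behind these figures can be checked with a few lines of Python. This is only a worked calculation using the frame size, bit depth and frame rate given in the text; the variable names are illustrative. The exact values come out a little higher than the rounded 3.125 MB/s quoted above, because the per-frame size is slightly more than 1 Mbit.

```python
WIDTH, HEIGHT = 320, 240   # image size in pixels
BITS_PER_PIXEL = 14        # A/D resolution
FRAMES_PER_SECOND = 25     # minimum frame rate for flicker-free display

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL        # 1,075,200 bits
bits_per_second = bits_per_frame * FRAMES_PER_SECOND    # 26,880,000 bits/s
bytes_per_second = bits_per_second // 8                 # 3,360,000 bytes/s

print(f"per frame : {bits_per_frame} bits ({bits_per_frame / 2**20:.3f} Mbit, binary)")
print(f"per second: {bits_per_second} bits/s ({bytes_per_second / 1e6:.2f} MB/s, decimal)")
```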

To ensure real-time image processing and display, the data transmission channels of the OMAP5910 should be fully utilized. The data transmission between the on-chip and external memory of the OMAP5910 is shown in Figure 3, which shows all the data transmission channels used when the OMAP5910 processes real-time images. Their transfer rates depend on the type of memory. To characterize the performance of these channels, the authors ran a series of experiments: with the system clock set to 150MHz, the working mode set to fully synchronous mode and the cache enabled, the time each channel takes to transfer 1 frame of data was measured. The results are listed in Table 1 and form the basis for optimizing the data transmission.

4 DMA mode data transmission optimization scheme

Based on a detailed analysis of the transfer rate of each data transmission channel, this paper proposes a data transmission optimization scheme in DMA mode. The scheme divides the whole frame into multiple blocks, and the image data to be processed by the OMAP5910 is placed entirely in the on-chip data memory of the OMAP5910 for processing. This not only eliminates a large number of interactions with external memory but also makes full use of the high-speed on-chip storage resources. Data transfers between internal and external memory are carried out by the DMA in the background, which greatly improves the efficiency of the OMAP5910.
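The block partitioning can be sketched as follows. This is a minimal illustration, not the authors' code: the helper name `split_into_blocks` is hypothetical, and the block height of 8 rows in the example is taken from the data-flow description in Section 4.1.

```python
def split_into_blocks(total_rows: int, rows_per_block: int) -> list[tuple[int, int]]:
    """Split a frame into consecutive row blocks.

    Returns a list of (first_row, row_count) pairs; the last block may be
    shorter if total_rows is not a multiple of rows_per_block.
    """
    if total_rows <= 0 or rows_per_block <= 0:
        raise ValueError("row counts must be positive")
    blocks = []
    for first_row in range(0, total_rows, rows_per_block):
        row_count = min(rows_per_block, total_rows - first_row)
        blocks.append((first_row, row_count))
    return blocks


# A 240-row frame in 8-row blocks gives 30 blocks.
frame_blocks = split_into_blocks(total_rows=240, rows_per_block=8)
print(len(frame_blocks), frame_blocks[0], frame_blocks[-1])
```

Each block is small enough to be processed entirely in on-chip memory, which is the point of the scheme.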

4.1 Data Flow

The data sampled by the A/D converter is first stored in an external buffer. When a certain amount of data has been collected, the programmable logic device triggers the ARM DMA to read the data; two frames of images are input in turn and saved in SDRAM. The parameters A and B required for image processing are read from the external Flash and also stored in SDRAM. The output frame is stored in units of 8-row blocks, and the DSP DMA is triggered to transfer block data from the external buffer area in SDRAM to the internal dual-access buffer area (DARAM) of the DSP core for the DSP core to process. Because DMA transfers run in the background, while the DSP core is performing calculations the DSP DMA transfers the image data of the previous block (8 lines) to SDRAM. After the ARM core receives the output line data, it triggers the ARM DMA to move the data to the external storage area controlled by the FPGA. The data flow is shown in Figure 4.

Because the OMAP5910 uses a double-buffer mechanism both on-chip and off-chip, the DMA transfers of the previous frame's data handled by the ARM core and the DSP core do not interfere with the transfer of the current frame's data. As a result, A/D data acquisition, DMA data transfer and CPU computation run with a high degree of parallelism across the whole system.
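The double-buffer (ping-pong) idea can be sketched in a few lines. This is a sequential host-side model under stated assumptions, not DSP code: the function name `process_with_ping_pong` is hypothetical, the list copy stands in for a background DMA transfer, and on the real hardware the load of the next block would overlap in time with the computation on the current one.

```python
from typing import Callable, Sequence


def process_with_ping_pong(blocks: Sequence[Sequence[int]],
                           compute: Callable[[Sequence[int]], int]) -> list[int]:
    """Process blocks using two alternating buffers (sequential model)."""
    results: list[int] = []
    if not blocks:
        return results
    buffers: list[list[int]] = [[], []]
    buffers[0] = list(blocks[0])           # prime the first buffer
    for i in range(len(blocks)):
        active = i % 2                     # buffer the CPU works on
        idle = 1 - active                  # buffer the "DMA" fills meanwhile
        if i + 1 < len(blocks):
            buffers[idle] = list(blocks[i + 1])   # stands in for background DMA
        results.append(compute(buffers[active]))  # CPU processes the active buffer
    return results


# Three small blocks, with a sum as a stand-in for the image algorithm.
demo_results = process_with_ping_pong([[1, 2], [3, 4], [5, 6]], sum)
print(demo_results)
```

The CPU never touches the buffer being filled, so neither side has to wait for the other as long as each transfer finishes within one compute step.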

4.2 Operation Timing

The operation timing of the data transmission shows another advantage of this optimization scheme: several of the original input processes (each of which inputs 1 line of pixels) are combined into one input process that inputs several lines of pixels at a time, and the output process, which was originally completed in one concentrated step, is distributed between the input processes to further improve performance.

The specific configuration is as follows. Each ARM DMA read process in the OMAP5910 inputs 12 lines of data, so 20 reads are needed to input 1 frame of image into SDRAM. The ARM DMA write process is started between the read processes. In the interrupt at the end of the first DMA read process of a frame, the line number and frame number are added and the DMA write process is started; this time only 2 lines are written. In the interrupt at the end of each of the 2nd to 20th DMA read processes of the frame, the write process is started again and 15 lines are written each time. The operation sequence is shown in Figure 5.
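The interleaved read/write sequence described above can be tabulated with a short script. This is only a bookkeeping sketch using the line counts given in the text; the function name `arm_dma_schedule` and its parameter names are hypothetical.

```python
def arm_dma_schedule(reads_per_frame: int = 20,
                     lines_per_read: int = 12,
                     first_write_lines: int = 2,
                     later_write_lines: int = 15) -> list[tuple[int, int, int]]:
    """Return (read_index, lines_read, lines_written) for each DMA read in a frame.

    The write is the one started in the interrupt at the end of that read.
    """
    schedule = []
    for read_index in range(1, reads_per_frame + 1):
        lines_written = first_write_lines if read_index == 1 else later_write_lines
        schedule.append((read_index, lines_per_read, lines_written))
    return schedule


schedule = arm_dma_schedule()
total_lines_read = sum(lines_read for _, lines_read, _ in schedule)
total_lines_written = sum(lines_written for _, _, lines_written in schedule)
print(total_lines_read, total_lines_written)
```

With the values from the text, the 20 reads cover all 240 input lines, and the writes started in the 20 end-of-read interrupts add up to 2 + 19 × 15 = 287 output lines per frame.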

The timing works out as follows. Reading 12 lines of data takes 17.07 μs × 12 = 204.84 μs, and writing 15 lines of data takes 27.68 μs × 15 = 415.2 μs, so the transfers take 204.84 μs + 415.2 μs = 620.04 μs; including interrupt processing, one read, write and interrupt cycle takes approximately 700 μs. The time allowed for one such cycle is 68 μs × 12 = 816 μs. Since 700 μs < 816 μs, the real-time performance of the real-time image processing system can be guaranteed.
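The same budget check can be written out as a small calculation. The per-line times and the 700 μs estimate are the figures quoted in the text; the variable names are illustrative, and the split between transfer time and interrupt overhead is derived here by subtraction rather than measured.

```python
READ_LINE_US = 17.07     # time to read one line, from the text
WRITE_LINE_US = 27.68    # time to write one line, from the text
LINE_BUDGET_US = 68.0    # time allowed per line, from the text
LINES_PER_READ = 12
LINES_PER_WRITE = 15
ESTIMATED_CYCLE_US = 700.0   # read + write + interrupt handling, as quoted

read_us = READ_LINE_US * LINES_PER_READ        # about 204.84 us
write_us = WRITE_LINE_US * LINES_PER_WRITE     # about 415.2 us
transfer_us = read_us + write_us               # about 620.04 us
budget_us = LINE_BUDGET_US * LINES_PER_READ    # 816 us

implied_overhead_us = ESTIMATED_CYCLE_US - transfer_us   # left for interrupt handling
margin_us = budget_us - ESTIMATED_CYCLE_US               # slack against the budget

print(f"transfers {transfer_us:.2f} us, budget {budget_us:.0f} us, margin {margin_us:.0f} us")
print("real-time constraint met:", ESTIMATED_CYCLE_US < budget_us)
```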

5 Code design and implementation in the optimization scheme

5.1 Main program on ARM side

The ARM core is mainly responsible for configuring the OMAP5910 system, handling data input/output on the interface with the FPGA, storing data in SDRAM, controlling and converting the frame mode, and other optimization functions.

5.2 Main program of DSP side

The DSP core mainly implements DSP DMA data input/output between I SRAM and DARAM, the infrared image two-point correction algorithm and the defect elimination algorithm, as well as the calculation of infrared image brightness and contrast parameters and other functions.

5.3 Summary of experience in debugging and configuring DMA channels

1) To check whether DMA channel data transfers are correct, first write an internal SDRAM-to-SDRAM DMA transfer test program to verify that the initial settings of the DMA channel and the data transfer itself are correct.

2) When measuring the data transfer rate between the DMA and the FPGA interface by observing the read or write signal on an oscilloscope, check that the number of read and write strobes equals the number of data items transferred.

3) Test the external interrupt of the DMA. The external interrupt pin is a multiplexed pin, so its function must be configured in advance.

4) Test the cooperation between the DMA read and write operations, and check whether the output transfer is correct.

5) Use different data in the specified memory to test whether the output of the video image is correct.

6) The DMA interrupt is triggered on the rising edge.

7) To ensure the integrity of the data transferred by the DMA, the DMA priority should be set.
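The memory-to-memory check in item 1 follows a common pattern: fill a source region with a known pattern, clear the destination, run the transfer, and compare. The sketch below models that pattern on a host machine with a `bytearray` standing in for SDRAM. It is an assumption-laden illustration only: `dma_copy` is a hypothetical stand-in for the real channel setup and start, and the regions are assumed not to overlap.

```python
def dma_copy(memory: bytearray, src: int, dst: int, length: int) -> None:
    """Hypothetical stand-in for a memory-to-memory DMA transfer."""
    memory[dst:dst + length] = memory[src:src + length]


def loopback_test(memory: bytearray, src: int, dst: int, length: int) -> bool:
    """Return True if the data copied to dst matches the pattern written at src."""
    if max(src, dst) + length > len(memory):
        raise ValueError("region exceeds memory size")
    if abs(src - dst) < length:
        raise ValueError("source and destination regions overlap")
    # A non-constant pattern catches stuck address or data lines better than all zeros.
    pattern = bytes((i * 7 + 3) & 0xFF for i in range(length))
    memory[src:src + length] = pattern
    memory[dst:dst + length] = bytes(length)   # clear the destination first
    dma_copy(memory, src, dst, length)
    return bytes(memory[dst:dst + length]) == pattern


simulated_sdram = bytearray(4096)
print(loopback_test(simulated_sdram, src=0, dst=2048, length=1024))
```

On the target, the comparison step stays the same; only `dma_copy` would be replaced by the actual channel configuration and a wait for the end-of-transfer interrupt.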

Epilogue

The real-time image processing system implements this data transmission optimization scheme. Images are displayed in real time at 25 frames/s, and the visual quality of the image is good. Flexible control of the DMA not only improves the transmission efficiency of image data but also makes full use of the high-speed performance of the OMAP5910.
