03-03-2011 12:08 PM
I'm wondering if there is a reference design for offline image processing.
All the designs I found on Avnet's website use the IP blocks for in-line processing, which means that all the image processing operations are carried out without accessing the frames stored in the external memory buffer.
We would like to use more than one frame at a time (ideally, 32 frames) for a custom-made, close-to-real-time image processing algorithm (offline). We were able to write a simple function to read back the data stored in the buffer, but we have run into a few bottlenecks.
They can be described as follows:
- The maximum number of frames is 16 (this is a limitation of the VDMA IP), even though the external memory can hold well beyond 16 frames. Do you happen to know a way to implement a 32-frame buffer using the current 16-frame VDMA IP? Should we try to add an extra set of VDMA blocks to our design to reach the 32-frame milestone?
- At this point, we are unsure how the MicroBlaze will handle the intermediate results and where it will store them. We are assuming that the MPMC takes control of the memory transfers via the VDMA IP signals, and that we can allocate a portion of the external memory to store any intermediate results. Is this a correct assumption?
I would appreciate it if you could provide us with a reference design for an offline image processing setup.
Thanks in advance.
03-04-2011 12:57 PM
The IVK reference designs do not use the VDMA's interrupts,
but you could try to add that functionality to the design.
This would allow you to manage a "larger" frame buffer in software
by changing the settings of the VDMA after each frame transfer.
The base address of each frame is specified in its own register
(16 registers if the VDMA is configured with NUM_FRAMES==16).
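As a rough illustration of re-pointing the base-address registers in software, here is a minimal sketch in C. It models the 16 registers as a plain array; on real hardware they would be memory-mapped VDMA registers whose offsets come from the core's documentation. The names `vdma_base_reg`, `pool_frame_addr`, `vdma_point_slot`, and the values of `POOL_BASE` and `FRAME_BYTES` are assumptions made for this example, not part of any Xilinx or Avnet driver.

```c
#include <assert.h>
#include <stdint.h>

#define VDMA_SLOTS   16u                   /* NUM_FRAMES == 16 */
#define POOL_FRAMES  32u                   /* frames kept in external memory */
#define FRAME_BYTES  (1280u * 720u * 4u)   /* assumption: 720p, 4 bytes per pixel */
#define POOL_BASE    0x40000000u           /* assumption: start of the frame pool */

/* Stand-in for the 16 memory-mapped base-address registers (hypothetical). */
static volatile uint32_t vdma_base_reg[VDMA_SLOTS];

/* Address of frame `frame` inside the software-managed pool. */
static uint32_t pool_frame_addr(uint32_t frame)
{
    assert(frame < POOL_FRAMES);
    return POOL_BASE + frame * FRAME_BYTES;
}

/* Re-point one hardware slot at a frame in the larger pool. */
static void vdma_point_slot(uint32_t slot, uint32_t frame)
{
    assert(slot < VDMA_SLOTS);
    vdma_base_reg[slot] = pool_frame_addr(frame);
}
```

With these assumed sizes, 32 frames occupy 32 * 3,686,400 bytes (about 112.5 MiB), so the pool has to fit in whatever region of external memory is reserved for it.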
03-04-2011 03:26 PM
Thanks for your reply!
We have been using the NUM_FRAMES parameter to test some of our algorithms and found that the software applications hang whenever it is set to a value greater than or equal to 16. Do you have any hints about that?
Also, I couldn't find in your reply an answer to my question about the MicroBlaze interaction, which remains an outstanding bottleneck. I realize that using interrupts will improve processing efficiency, and since the data and control buses are isolated, things should go much faster. Could you please elaborate a little more on that?
I appreciate your help.
03-08-2011 08:07 AM
In my latest post, I forgot to ask you for an example that uses the VDMA's interrupts. Do you happen to have one I could use as a reference?
Thanks in advance.
03-15-2011 10:16 AM
I just posted a tutorial on the forum on how to add interrupt functionality to the Video DMA cores.
Let me know if this answers your question.
03-16-2011 08:39 AM
Thanks a lot for your reply! The description of the interrupt process using the MicroBlaze soft core seems really useful, but I'm not quite sure it is directly related to my application's goal.
Perhaps I used the wrong words to describe it. We want close-to-real-time processing of live video.
The incoming data (image frames) can't be stopped. The images keep coming into the buffer; they are added to the FIFO and processed as soon as they arrive. The data arrives at a pixel rate of about 40 MHz (~30 frames per second) at 720p resolution. I believe the MicroBlaze processor will not be able to keep up at such a rate (not even using interrupts).
Is there any way I can reach you over the phone? Direct email?
03-16-2011 09:53 AM
In order to achieve 32 frames, you could manually update the 16 base address registers for the VDMAs.
You could use the interrupt handler on the video input side (vdma_0) to maintain a 32-bit counter.
The interrupt handler could implement a scheme where you update a subset of the 16 regs in a circular fashion.
It could also signal the application when processing can start.
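The scheme above could be sketched roughly as follows. This is only a model that runs on a host: the 16 registers are a plain array, and the sketch assumes the VDMA cycles through its 16 slots in order and raises one interrupt per completed frame. All names (`vdma_base_reg`, `capture_init`, `vdma_frame_done_isr`, `pool_full`) and the values of `POOL_BASE` and `FRAME_BYTES` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VDMA_SLOTS   16u                   /* NUM_FRAMES == 16 */
#define POOL_FRAMES  32u                   /* frames wanted in memory */
#define FRAME_BYTES  (1280u * 720u * 4u)   /* assumption: 720p, 4 bytes per pixel */
#define POOL_BASE    0x40000000u           /* assumption: start of the frame pool */

static volatile uint32_t vdma_base_reg[VDMA_SLOTS]; /* stand-in for the 16 regs */
static volatile uint32_t frames_done;               /* 32-bit frame counter */
static volatile bool     pool_full;                 /* signals the application */

static uint32_t pool_addr(uint32_t frame)
{
    assert(frame < POOL_FRAMES);
    return POOL_BASE + frame * FRAME_BYTES;
}

/* Point the 16 slots at pool frames 0..15 before starting the capture. */
static void capture_init(void)
{
    frames_done = 0u;
    pool_full = false;
    for (uint32_t s = 0u; s < VDMA_SLOTS; s++)
        vdma_base_reg[s] = pool_addr(s);
}

/* Called once per completed input frame (video input side, vdma_0). */
static void vdma_frame_done_isr(void)
{
    uint32_t n    = frames_done;        /* index of the frame just written */
    uint32_t slot = n % VDMA_SLOTS;     /* slot the VDMA just finished with */

    /* The slot is reused 16 frames from now: aim it at that pool frame. */
    vdma_base_reg[slot] = pool_addr((n + VDMA_SLOTS) % POOL_FRAMES);

    frames_done = n + 1u;
    if (frames_done >= POOL_FRAMES)
        pool_full = true;               /* all 32 frames captured at least once */
}
```

Only the slot that has just been released is rewritten in each interrupt, so the VDMA never has a register changed under a transfer that is in progress (again, under the in-order assumption stated above).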
If you need all 32 frames for your processing, then you will need to stop the acquisition.
Otherwise, you will need to implement twice the number of frames (i.e. 64 frames) and manage them in a ping/pong fashion.
This assumes that the processing of 32 frames takes less than about 1 sec (i.e. less than 32 frames / 30 frames/sec = 1.07 sec).
Otherwise, you will need to stall the acquisition until the processing is done ...
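The ping/pong bookkeeping and the time budget can be written down as a small sketch. It assumes two banks of 32 frames each and a 30 frames/sec input; the function names are made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BANK_FRAMES    32u   /* frames per bank */
#define NUM_BANKS      2u    /* ping and pong: 64 frames in total */
#define FRAME_RATE_HZ  30u   /* assumption: 30 frames/sec input */

/* Bank that captured frame number n is written into (0 = ping, 1 = pong). */
static uint32_t capture_bank(uint32_t n)
{
    return (n / BANK_FRAMES) % NUM_BANKS;
}

/* Bank that is complete and free for processing while frame n is captured. */
static uint32_t process_bank(uint32_t n)
{
    assert(n >= BANK_FRAMES);   /* no complete bank before 32 frames */
    return 1u - capture_bank(n);
}

/* Time to fill one bank, in whole milliseconds (rounded down). */
static uint32_t bank_fill_ms(void)
{
    return (BANK_FRAMES * 1000u) / FRAME_RATE_HZ;
}

/* Processing must finish before the other bank fills up. */
static bool processing_keeps_up(uint32_t processing_ms)
{
    return processing_ms < bank_fill_ms();
}
```

If `processing_keeps_up` would return false for the measured processing time, the capture side catches up with the bank still being processed, which is the case where the acquisition has to be stalled.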
P.S.: I sent you a PIM