As I write this, I have just noticed that only a little over two weeks remain in the official coding period of Google Summer of Code. So much time has already passed, and I hope to complete the milestones by then. This week’s focus was to first implement static mixing of the inputs and, once that was working in hardware, to change the multiplier value from firmware to get dynamic mixing.
This post covers work done from 23rd July to 29th July.
Adding modules to Video Pipeline:
Last week I added my float arithmetic modules to the input video pipeline, that is, in the hdmi_in files. Though this was a good test of whether they work, the modules are supposed to go in the output pipeline (gateware/hdmi_out/phy.py).
Major tasks done this week:
- Fixed the bug that caused missing colors in gradients: This was a bug in the floating point multiplier unit which caused random colors in the gradient to go missing. It wasn’t spotted in simulation earlier because the floating point units had not been tested dynamically with varying inputs. The cause was an error in the pipeline: stage 5 was copying a value from stage 3 instead of stage 4. It was a bit difficult to track down, because when a lot of data is going through the pipeline the output is not easily decipherable. What I noticed was that the frac of the current output depended on the frac of previous outputs, and after that it was really easy to fix.
- Adding mixer block in hardware, with input to adder and mult hardwired
Once the modules seemed to work perfectly in simulation, they were all added together with the same layout (the other inputs of add and mult hardwired to zero and one respectively). In this case the layout at floatadd and floatmult is always rgb16f_layout. This is good because the same layout means that we can connect them easily using the Record.connect() method. This was tested and worked perfectly. The next task was to check whether mixing was done correctly, and for that I needed to connect modules with different layouts.
- Figure out connecting blocks with different layouts:
For connecting two PipelinedActor modules we generally use the Record.connect() method, which by default connects the other signals (like ack and stb) as well as the payload signals. I first tried to define two sinks for the floatadd module. This would have been perfect, but I encountered an error. After a lot of asking around without getting an exact answer on how to proceed, I decided to dive into the libraries that define these classes and methods myself. I mainly focused on how Record.connect() is implemented and on the classes the inputs are derived from.
So apart from the usual payload signals, the source and sink of a pipelined module have four other signals defined, which control the flow of packets from the source of the master to the sink of the slave. These signals are stb, ack, sop and eop. Information about stb and ack can be found here. The other two, sop and eop, refer to start of packet and end of packet. So we basically want to know the equivalent connections for Record.connect():
    # This is the Record.connect() method
    Record.connect(ycbcr2rgb.source, rgb2rgb16f.sink),

    def rgb_layout(dw):
        return [("r", dw), ("g", dw), ("b", dw)]

    # This is alternate way of doing equivalent connections
    rgb2rgb16f.sink.r.eq(ycbcr2rgb.source.r),
    rgb2rgb16f.sink.g.eq(ycbcr2rgb.source.g),
    rgb2rgb16f.sink.b.eq(ycbcr2rgb.source.b),
    rgb2rgb16f.sink.stb.eq(ycbcr2rgb.source.stb),
    ycbcr2rgb.source.ack.eq(rgb2rgb16f.sink.ack),
    rgb2rgb16f.sink.sop.eq(ycbcr2rgb.source.sop),
    rgb2rgb16f.sink.eop.eq(ycbcr2rgb.source.eop),
More information about the significance and implementation of these signals is in this document (page 23).
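The direction rule behind those manual connections can be written down as a small plain-Python model. This is only a sketch for reasoning about the wiring, not Migen code: `connection_pairs` is a hypothetical helper that works on signal names as strings, and it assumes that the payload fields, stb, sop and eop are driven by the source while ack is driven by the sink, as in the listing above.

```python
def connection_pairs(source, sink, payload_fields):
    """Return (destination, origin) name pairs for one source-to-sink link.

    Hypothetical helper for illustration; it only builds strings.
    """
    pairs = []
    # Payload fields plus stb, sop and eop travel from the source to the sink.
    for name in list(payload_fields) + ["stb", "sop", "eop"]:
        pairs.append((f"{sink}.{name}", f"{source}.{name}"))
    # ack is the only signal that travels back from the sink to the source.
    pairs.append((f"{source}.ack", f"{sink}.ack"))
    return pairs


if __name__ == "__main__":
    for dst, src in connection_pairs("ycbcr2rgb.source",
                                     "rgb2rgb16f.sink",
                                     ["r", "g", "b"]):
        print(f"{dst}.eq({src}),")
```

When the two endpoints have different layouts, the same idea applies: list only the payload fields that actually have to be wired and keep the four control signals as they are.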
- The rgb16f2rgb unit didn’t have a mechanism to handle overflow of float values correctly. When fed float values greater than 1.0, the output came out as 0, but it is supposed to saturate to 255. This was fixed in the rgb16f2rgb file by adding a simple condition at the output.
- While adding the eq statements equivalent to Record.connect(), as discussed above, I put the .eq statements in the opsis_video.py file, but they didn’t seem to be working. After about a day of random tries here and there, I found out that I hadn’t wrapped the .eq statements in a self.comb += . This caused a lot of delay because each compilation cycle for hardware took about 20 minutes, which made it very difficult to spot the error.
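The saturation behaviour expected from rgb16f2rgb, mentioned in the overflow item above, can be modelled in a few lines of plain Python. This is a software sketch only, not the gateware: `float_to_u8` is a hypothetical name, and both the scaling by 255 and the rounding are assumptions about how a colour component in the range 0.0 to 1.0 maps to 8 bits.

```python
def float_to_u8(value):
    """Map a float colour component to 0..255, saturating out-of-range input.

    Hypothetical model; the real unit works on half-precision bit fields.
    """
    if value >= 1.0:
        # The "simple condition at the output": clamp instead of wrapping to 0.
        return 255
    if value <= 0.0:
        return 0
    # Assumed scaling: round to the nearest 8-bit code.
    return int(value * 255 + 0.5)


if __name__ == "__main__":
    for sample in (-0.2, 0.0, 0.5, 1.0, 1.5):
        print(sample, "->", float_to_u8(sample))
```

A model like this is handy as a reference when checking simulation output, since values above 1.0 are easy to produce once two pixel streams are added together.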
Things seemed to be mixing well after this, though there is some alignment problem due to timing inconsistency, which I still need to figure out. For static testing I hardwired the mult with a value of 0.5 and connected the outputs from HDMI_OUT0 and HDMI_OUT1 to the floatadd unit of HDMI_OUT0. Later I added a CSRStorage register to vary the multiplier value dynamically from firmware and create a fade-like effect. I also added supporting functions to ci.c and other functions for maintaining timing.
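As a rough software model of the mixing described here, each pixel component is multiplied by its multiplier and the two products are added. The names `to_half` and `mix_pixel` are hypothetical, the rounding to half precision after every operation only approximates what the floatmult and floatadd units do, and choosing the second multiplier as one minus the first for the fade is my assumption for this sketch.

```python
import struct


def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def mix_pixel(a, b, mult_a, mult_b):
    """Return a*mult_a + b*mult_b per component, rounded like 16-bit floats."""
    out = []
    for ca, cb in zip(a, b):
        pa = to_half(to_half(ca) * to_half(mult_a))  # model of floatmult, input 0
        pb = to_half(to_half(cb) * to_half(mult_b))  # model of floatmult, input 1
        out.append(to_half(pa + pb))                 # model of floatadd
    return tuple(out)


if __name__ == "__main__":
    red, green = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    # Static test: both multipliers hardwired to 0.5.
    print(mix_pixel(red, green, 0.5, 0.5))
    # Fade: step the multiplier as firmware would through a register.
    for step in range(5):
        m = step / 4
        print(m, mix_pixel(red, green, m, 1.0 - m))
```

With both multipliers at 0.5 the result is an even blend of the two inputs, which matches the static test, and stepping the multiplier from 0 to 1 gives the fade.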