After a careful reading of the comments by Gunnam et al. [1], we identified two main points, which are discussed hereafter.
1.1. Point 1: Cited Papers
Gunnam et al. claim that we did not correctly cite their work [2] and refer to four other publications of their own for further explanation. Actually, the introductory section of our work [3] aims at providing an overview of the state-of-the-art architectures on the subject. The five works by Gunnam et al. essentially propose the same LDPC architecture, with the description of its features spread across the five publications. Therefore, to be fair and balanced with respect to the other state-of-the-art architectures, we decided to cite only one of their works, namely the one providing the most details on the architecture and the implementation results [2]. Finally, the selected paper was correctly cited in our work [3], with no misleading information or wrong assertions regarding the architecture described by Gunnam et al.
1.2. Point 2: Architectural Efficiency
In our paper [3], we defined a metric to compare the efficiency of different LDPC architectures in terms of the (average) number of clock cycles per block and per iteration, with the term "block" referring to a circulant of the parity check matrix. We applied this metric to our design as well as to other available implementations, including [2]; in doing so, we used the throughput figures reported in each referenced paper.
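As an illustrative sketch of how such a metric can be derived from the reported figures (the symbols below are our own and do not appear in either paper; the throughput $T$ is assumed to be expressed in coded bits per second), one may write

\[
\eta \;=\; \frac{f_{\mathrm{clk}} \cdot N}{T \cdot B \cdot I_{\mathrm{avg}}},
\]

where $f_{\mathrm{clk}}$ is the clock frequency, $N$ the codeword length, $B$ the number of circulants in the parity check matrix, and $I_{\mathrm{avg}}$ the average number of decoding iterations; a lower $\eta$ corresponds to a more efficient architecture.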
Gunnam et al. claim that this is not a fair metric because it involves the average number of iterations. Frankly, we struggle to understand the point being raised. On the one hand, it is common practice to refer to the average number of iterations when expressing the system throughput. On the other hand, Gunnam et al. themselves use the average number of iterations in [2] to evaluate their throughput figures. Moreover, Gunnam et al. state that the overhead of the statistical buffering has not been taken into account. Although statistical buffering is not mentioned in the cited paper [2], such buffering affects the decoding latency rather than the system throughput. In summary, we remain confident in the fairness of the considered architectural efficiency metric and in the data provided in our paper.
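To make the distinction between latency and throughput concrete, one may sketch it as follows (again with illustrative symbols of our own, assuming the decoder pipeline is kept full):

\[
\Lambda \;=\; D_{\mathrm{buf}} + D_{\mathrm{dec}},
\qquad
T \;=\; \frac{N \cdot f_{\mathrm{clk}}}{C_{\mathrm{cw}}},
\]

where $\Lambda$ is the overall decoding latency, $D_{\mathrm{buf}}$ the delay introduced by the statistical buffering, $D_{\mathrm{dec}}$ the decoding time itself, and $C_{\mathrm{cw}}$ the average number of clock cycles per codeword. The buffering term enters $\Lambda$ but not the sustained throughput $T$, which is the point made above.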