This document describes a proposed low-power CORDIC-based DCT architecture that prioritizes low-frequency DCT coefficients over high-frequency coefficients to reduce power consumption with minimal image quality degradation. It uses a look-ahead CORDIC approach so that the number of CORDIC iterations can vary between coefficients. Experimental results show that the proposed architecture achieves 38.1% area savings and lower power than the DA-based DCT, and power comparable to the MCM-based DCT in a smaller area, with a minor 0.04 dB quality loss.
Reconfigurable CORDIC Low-Power Implementation of Complex Signal Processing for Reducing Power Dissipation
Scientific Journal Impact Factor (SJIF): 1.711
International Journal of Modern Trends in Engineering and Research (IJMTER)
Volume 01, Issue 06, [December - 2014]
www.ijmter.com
@IJMTER-2014, All rights Reserved
e-ISSN: 2349-9745
p-ISSN: 2393-8161
K. Srinivasan¹, S. Sathyamoorthy²
¹II-M.E (CS), ²AP / ECE, Dhanalakshmi Srinivasan Engineering College, Perambalur.
Abstract— In recent years, CORDIC algorithms have been used extensively in image processing systems and biomedical applications. With the CORDIC algorithm, the number of iterations needed to process an image can be reduced. Low-power design remains a challenge during system operation, and previous approaches aimed to minimize power consumption without considering image quality. In this paper, the iterations of a CORDIC-based low-power DCT are allocated according to their effect on image quality. A hardware implementation based on ROMs and control logic requires a large hardware area. The look-ahead CORDIC approach is used to complete several iterations at one time. Reducing both the hardware area and the number of iterations helps to maximize battery lifetime. This idea is used to achieve a low-power design for image and video compression applications.
Index Terms— Coordinate rotation digital computer (CORDIC), look-ahead, datapaths, low-power, discrete cosine transform (DCT)
I. INTRODUCTION
With the explosive growth of multimedia services running on portable applications, the demand for low-power implementations of complex signal processing algorithms is increasing enormously. The most important part of multimedia systems are the applications involving image and video processing, which are very computationally intensive and therefore need to be implemented at low cost because of the limited battery lifetime of portable devices. Many previous research efforts have focused on reducing the power dissipation of image and video applications. In particular, the low-power design of the discrete cosine transform (DCT) has been of specific interest, since the DCT is one of the most computationally intensive operations in video and image compression, and it is widely adopted in several standards such as JPEG, MPEG, and H.264.
Since it was first proposed in 1959, the coordinate rotation digital computer (CORDIC) has been widely used to calculate trigonometric functions in signal processing applications such as QR decomposition, the fast Fourier transform, singular value decomposition, and so on. Since CORDIC can be implemented simply with iterative add and shift operations, it has been widely used for numerous low-power DCT architectures. Many previous research works focused on reducing the hardware complexity of the DCT, such as the distributed arithmetic (DA)-based DCT and the multiple constant multiplication (MCM)-based approach. Although the bit-serial DA-based approach offers a regular and simple DCT architecture, a large hardware area is needed for bit-parallel operation because of the additional ROMs and control logic. The MCM-based DCT can be implemented with a smaller number of shift-and-add operations; however, to make a trade-off between image quality and computation energy, the computation sharing in the different data paths has to be completely reconsidered. In a previously presented low-power CORDIC-based DCT architecture, data correlations between neighboring pixels are efficiently used to skip the internal CORDIC iterations. An approximation technique, or incorporating compensation steps into the quantization, has also been exploited to reduce the power consumption of the CORDIC-based DCT architecture. Most of the previous research works are mainly focused on reducing the number of arithmetic units; the inherent data priorities in the DCT coefficients, however, have not been exploited in the CORDIC-based DCT.
In the DCT, not all computations are equally important in generating the frequency-domain outputs (DCT coefficients). In other words, some of the computations in the DCT are critical in determining the output image quality, while others play relatively minor roles. This interesting property can be used to provide a proper trade-off between output image quality and power dissipation. In this paper, we present a low-power CORDIC-based DCT architecture in which the importance differences among the DCT coefficients are efficiently exploited to achieve power savings with minimum image quality degradation. To apply priority-based processing, look-ahead CORDIC architectures are adopted to overcome the inherent data dependencies in the conventional CORDIC architecture. Thus, the number of CORDIC iterations is dynamically controlled according to the importance of the DCT coefficients, by which considerable power savings are achieved.
The rest of this paper is organized as follows. The basics of the CORDIC algorithm and the conventional CORDIC-based DCT are presented in Section II. The proposed low-power CORDIC-based DCT architecture and its hardware implementation are presented in Section III. Based on the proposed DCT architecture, a reconfigurable CORDIC-based DCT is presented in Section IV. Performance results are presented in Section V. Finally, conclusions are drawn in Section VI.
II. SYSTEM MODEL
A. CORDIC Architecture
The basic principle of CORDIC is to iteratively rotate a vector using a rotation matrix, which is represented as follows:

$$
\begin{bmatrix} x_i \\ y_i \\ z_i \end{bmatrix} =
\begin{bmatrix}
x_{i-1} - \sigma_i 2^{1-i} y_{i-1} \\
y_{i-1} + \sigma_i 2^{1-i} x_{i-1} \\
z_{i-1} - \sigma_i \alpha_i
\end{bmatrix}
$$

where x and y are the vector coordinate components along the x and y axes, respectively, i is the iteration step, $\sigma_i$ is the sign bit, which can be +1 or −1 and indicates the direction of the vector rotation, z is the accumulated rotation angle, and $\alpha_i$ is the predefined angle value of each micro-rotation step. In the CORDIC architecture, the amplitude and argument of a given vector can be calculated using the vectoring mode, while the sine and cosine values of a given angle are obtained with the rotation mode.
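The rotation mode described above can be sketched in a few lines of Python. This is a minimal floating-point illustration rather than the hardware datapath: the function name `cordic_rotate`, the default of 16 iterations, and the choice $\alpha_i = \arctan(2^{1-i})$ (the angle whose tangent equals the shift used in step i) are assumptions made for the example.

```python
import math

def cordic_rotate(x, y, angle, n_iter=16):
    """Rotate (x, y) by `angle` radians using n_iter CORDIC micro-rotations."""
    z = angle                      # residual angle still to be rotated
    gain = 1.0                     # accumulated magnitude growth of the vector
    for i in range(1, n_iter + 1):
        shift = 2.0 ** (1 - i)     # 2^(1-i); a plain right shift in hardware
        sigma = 1.0 if z >= 0 else -1.0   # sign bit: direction of this step
        x, y = x - sigma * shift * y, y + sigma * shift * x
        z -= sigma * math.atan(shift)     # alpha_i = arctan(2^(1-i))
        gain *= math.sqrt(1.0 + shift * shift)
    # Dividing by the gain undoes the magnitude scaling of the micro-rotations.
    return x / gain, y / gain, z

cos_est, sin_est, residual = cordic_rotate(1.0, 0.0, math.pi / 16)
```

Rotating the unit vector (1, 0) by π/16 in this way yields approximations of cos(π/16) and sin(π/16), and the bound on the residual angle tightens as more iterations are executed.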
1) Look-ahead CORDIC Approach: In the CORDIC equation shown above, the results of the previous iteration stages have to be computed first in order to calculate the output of the current stage. These data dependencies are the main performance bottleneck in conventional CORDIC hardware. To overcome the data dependencies, look-ahead CORDIC has been developed, where look-ahead means that a number of CORDIC iterations can be computed ahead so that the iterations are completed at one time. An example is the four-iteration-step look-ahead CORDIC. It is noteworthy that if the sign bits are known in advance, the following stage iterations can be computed directly from the input vectors of the current stage without computing the intermediate results.
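The look-ahead idea can be illustrated numerically: when the sign bits are known, the micro-rotations collapse into one precomputed 2 × 2 matrix that is applied directly to the input vector. The sketch below assumes a hypothetical four-step sign-bit sequence and the $2^{1-i}$ shifts of the CORDIC equation; all helper names are illustrative only.

```python
def micro_rotation(sigma, i):
    """2x2 matrix of the i-th CORDIC step for sign bit sigma (shift 2^(1-i))."""
    s = sigma * 2.0 ** (1 - i)
    return ((1.0, -s), (s, 1.0))

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def look_ahead_matrix(sign_bits):
    """Combine all micro-rotations ahead of time into a single matrix."""
    m = ((1.0, 0.0), (0.0, 1.0))
    for i, sigma in enumerate(sign_bits, start=1):
        m = matmul2(micro_rotation(sigma, i), m)
    return m

def iterate(x, y, sign_bits):
    """Conventional CORDIC: each step needs the result of the previous step."""
    for i, sigma in enumerate(sign_bits, start=1):
        s = sigma * 2.0 ** (1 - i)
        x, y = x - s * y, y + s * x
    return x, y

signs = (1, -1, 1, -1)            # hypothetical, known-in-advance sign bits
m = look_ahead_matrix(signs)
x0, y0 = 3.0, 5.0
direct = (m[0][0] * x0 + m[0][1] * y0, m[1][0] * x0 + m[1][1] * y0)
stepwise = iterate(x0, y0, signs)
```

Here `direct` is obtained from the inputs in one step, with no intermediate vectors, and agrees with `stepwise`; the entries of `m` are sums of signed powers of two, which is what allows a shift-and-add implementation.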
2) Scale-Factor in CORDIC Operations: In the CORDIC operation, the magnitude of the rotated vector is scaled and accumulated after every iteration according to the following equation:

$$
K_i = \frac{1}{\sqrt{1 + 2^{2(1-i)}}}
$$
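Because the set of executed iterations fixes the overall gain, the compensation factor can be computed ahead of time as the product of the per-iteration factors. A minimal sketch, assuming the factor $1/\sqrt{1 + 2^{2(1-i)}}$ for each executed step i and a hypothetical helper name `scale_factor`:

```python
import math

def scale_factor(executed):
    """Product of 1/sqrt(1 + 2^(2(1-i))) over the executed iteration indices."""
    k = 1.0
    for i in executed:
        k /= math.sqrt(1.0 + 2.0 ** (2 * (1 - i)))
    return k

k_full = scale_factor(range(1, 17))      # all of the first 16 iterations
k_skipped = scale_factor((1, 2, 4))      # only a subset is executed
```

Skipping iterations changes the product, which is why the scale factor has to be matched to the particular set of iterations that is actually executed.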
B. CORDIC-Based DCT Architecture
The 2-D DCT process is decomposed into a 1-D DCT (row DCT) followed by another 1-D DCT (column DCT), which is expressed by the following equation:

$$
Y = T x T^{T} = T \left( T x^{T} \right)^{T}
$$

where x and Y are the 8 × 8 image data matrix and the 2-D DCT transformed output matrix, respectively, and T is the 8 × 8 1-D DCT basis matrix.
[Figure: CORDIC algorithm block with inputs x, y and outputs X = x cos(θ) − y sin(θ), Y = x sin(θ) + y cos(θ)]
After the 2-D DCT operation, the input data in the spatial domain are transformed to the frequency domain, that is, the 8 × 8 block of 64 DCT coefficients shown in Fig. 4. Because the DCT has the energy compaction property, the signal energy of the output data (DCT coefficients) is mostly concentrated in a few low-frequency components, while the other higher frequency components are associated with small signal energy. The high-frequency DCT coefficients become even smaller after the quantization step [5], which means that the low-frequency components (DC) are more sensitive to human eyes than the high-frequency components.
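The row-column decomposition and the energy compaction property can both be checked on a small example. The sketch below assumes the orthonormal DCT-II basis for T (the normalization is not stated in the text) and uses a smooth ramp block as hypothetical input; the function names are illustrative.

```python
import math

N = 8

def dct_matrix(n=N):
    """Orthonormal 1-D DCT-II basis matrix T (assumed normalization)."""
    t = []
    for k in range(n):
        c = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        t.append([c * math.cos((2 * j + 1) * k * math.pi / (2 * n))
                  for j in range(n)])
    return t

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def dct2_direct(x):
    """Y = T x T^T."""
    t = dct_matrix()
    return matmul(matmul(t, x), transpose(t))

def dct2_row_column(x):
    """Y = T (T x^T)^T: a row DCT followed by a column DCT."""
    t = dct_matrix()
    row_pass = matmul(t, transpose(x))       # 1-D DCT of every row of x
    return matmul(t, transpose(row_pass))    # 1-D DCT of every column

def low_frequency_energy_fraction(y, size=2):
    """Share of the total energy held by the top-left size x size block."""
    total = sum(v * v for row in y for v in row)
    low = sum(y[r][c] ** 2 for r in range(size) for c in range(size))
    return low / total

ramp = [[float(r + c) for c in range(N)] for r in range(N)]  # smooth test block
y_direct = dct2_direct(ramp)
y_rowcol = dct2_row_column(ramp)
fraction = low_frequency_energy_fraction(y_direct)
```

For this smooth block the two forms give the same coefficients, and almost all of the energy sits in the four lowest-frequency coefficients, which is the behavior the priority-based design relies on.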
The main idea in this paper is based on the fact that low-frequency DCT coefficients are relatively more important than high-frequency coefficients. Our CORDIC-based DCT architecture is designed considering the importance differences between the low- and high-frequency DCT coefficients. Generally, the more iterations are performed in CORDIC, the more accurate the results are. Therefore, in the proposed DCT architecture, a larger number of CORDIC iterations is allocated to generate the low-frequency DCT coefficients, while a relatively smaller number of iterations is used for the high-frequency components. The number of CORDIC iterations is judiciously selected such that the image quality degradation due to the smaller number of iterations is minimized. Detailed explanations of the DCT hardware are presented in the following sections.
III. PRIORITY-BASED LOW-POWER DCT ARCHITECTURE USING LOOK-AHEAD CORDIC APPROACH
A. Data Priority Considered Look-ahead CORDIC Architecture
In the conventional CORDIC structure, because of the crossing data path, changing the number of iterations for the two separate CORDIC datapaths is not possible. To assign different numbers of iterations to the two CORDIC datapaths, we adopt the look-ahead CORDIC approach in the proposed DCT architecture.
Assuming that the CORDIC result needs four iterations for x while three iterations are needed for y, the look-ahead CORDIC equation can be written separately for each of the two results, which means that each datapath can be treated individually.
The difference between the conventional crossing CORDIC architecture and the look-ahead-based approach is that, once the look-ahead approach is applied to the CORDIC architecture, the number of iterations is easily controlled because all the internal datapaths become independent.
In the proposed CORDIC-based DCT architecture, where different numbers of iterations are allocated for generating the DCT coefficients, the number of iterations has to be set carefully to minimize the error between the required input angle and the corresponding accumulated angle. Table I shows the iterations executed at the i-th stages and the corresponding rotation directions σ (sign bits). For example, to rotate the vector by π/16, only the i-th iterations (i = 0, 1, 3, 10) are executed and the rest of the iterations are skipped for power savings.
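The π/16 example can be checked numerically. The sketch below assumes that the index i in this example denotes a micro-rotation angle of arctan(2^-i), since the index set starts at 0, and that each sign bit is chosen to drive the residual angle towards zero; the resulting sign bits are therefore a computed assumption, not values copied from Table I.

```python
import math

def accumulated_angle(target, executed):
    """Accumulate signed micro-rotation angles arctan(2^-i) for i in `executed`.

    Returns the accumulated angle and the sign bits that were chosen.
    """
    z = target                      # residual angle
    signs = []
    for i in executed:
        sigma = 1 if z >= 0 else -1
        z -= sigma * math.atan(2.0 ** -i)
        signs.append(sigma)
    return target - z, signs

target = math.pi / 16
angle4, signs4 = accumulated_angle(target, (0, 1, 3, 10))
angle3, signs3 = accumulated_angle(target, (0, 1, 3))
error4 = abs(angle4 - target)
error3 = abs(angle3 - target)
```

Under these assumptions both the four-iteration and the three-iteration sets stay within an angle error of 0.004, with the four-iteration set being the more accurate of the two.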
In our DCT, the iterations to be skipped are carefully selected such that the error between the required angle and the corresponding accumulated angle does not exceed 0.004 for all the given angles. The number of CORDIC iterations for each combination used to derive the look-ahead CORDIC algorithm is determined using the software modeling method presented in Section III-C.
As mentioned in Section II-A2, the scale factor is determined by the number of executed CORDIC iterations. Because the number of iterations is known in advance, the scale factors are precomputed, as shown in Table II. In the table, the scale factors are represented in signed power-of-two format, and the quantization error of the scaling factor is below 10E−4.
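A signed power-of-two representation of a precomputed scale factor can be obtained with a simple greedy search. This is only a sketch under stated assumptions: the tolerance of 1e-4, the greedy strategy, the 0-based shift convention 2^-i for the iteration set (0, 1, 3, 10), and the function name are all chosen for illustration, and the resulting terms are not the entries of Table II.

```python
import math

def signed_power_of_two(value, tol=1e-4, max_terms=12):
    """Greedily approximate `value` by a sum of terms sign * 2^exponent."""
    terms = []
    residual = value
    while abs(residual) >= tol and len(terms) < max_terms:
        # Power of two closest to the residual on a logarithmic scale.
        k = round(-math.log2(abs(residual)))
        sign = 1 if residual > 0 else -1
        terms.append((sign, -k))
        residual -= sign * 2.0 ** -k
    return terms, residual

# Scale factor for the executed iterations (0, 1, 3, 10), shift 2^-i assumed.
k_exact = 1.0
for i in (0, 1, 3, 10):
    k_exact /= math.sqrt(1.0 + 2.0 ** (-2 * i))

terms, residual = signed_power_of_two(k_exact)
k_approx = sum(sign * 2.0 ** exponent for sign, exponent in terms)
```

Each term corresponds to one shifted copy of the operand, so the number of terms is a rough proxy for the adder cost of applying the scale factor.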
One interesting observation when the look-ahead approach is applied to CORDIC is that removing the high shift-terms has a similar effect to a look-ahead CORDIC that uses a smaller number of iterations. For example, the CORDIC rotation by π/16 can be executed using only three iterations (i = 0, 1, 3), which corresponds to the four-iteration look-ahead equation with its high shift-terms removed. Please note that the number of CORDIC iterations can therefore be controlled simply by removing the high shift-terms.
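The equivalence between dropping high shift-terms and executing fewer iterations can be made concrete by expanding the look-ahead product into individual shift-terms. The sketch below treats each micro-rotation as multiplication by the complex number 1 + jσ·2^-i, which is an equivalent way of writing the 2 × 2 rotation step; the 0-based shift convention and the sign bits (+1, −1, −1, −1) are assumptions made for the example.

```python
from itertools import combinations

def shift_terms(executed, signs):
    """Expand prod(1 + j*sigma_i*2^-i) into signed shift-terms.

    Returns (real_terms, imag_terms) as lists of (sign, shift), so that
    x' = Re*x - Im*y and y' = Im*x + Re*y before scaling.
    """
    real, imag = [], []
    steps = list(zip(executed, signs))
    for r in range(len(steps) + 1):
        for subset in combinations(steps, r):
            shift = sum(i for i, _ in subset)
            sign = 1
            for _, sigma in subset:
                sign *= sigma
            if r % 4 in (2, 3):        # j^2 = -1 and j^3 = -j
                sign = -sign
            (real if r % 2 == 0 else imag).append((sign, shift))
    return real, imag

def drop_high(terms, limit):
    """Remove every shift-term whose shift amount is `limit` or larger."""
    return [(s, k) for s, k in terms if k < limit]

def evaluate(terms):
    return sum(s * 2.0 ** -k for s, k in terms)

full_re, full_im = shift_terms((0, 1, 3, 10), (1, -1, -1, -1))
short_re, short_im = shift_terms((0, 1, 3), (1, -1, -1))
pruned_re = drop_high(full_re, 10)
pruned_im = drop_high(full_im, 10)
```

In this example every term that involves the i = 10 step has a shift of at least 10, so removing those terms leaves exactly the three-iteration expansion.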
B. Proposed Low-Power CORDIC-Based DCT Architecture
As mentioned in the last part of Section III-A, considering the data priorities in the DCT coefficients, the high shift-terms of the look-ahead CORDIC can be carefully removed, which has the same effect as a smaller number of CORDIC iterations. Because a smaller number of CORDIC iterations means a CORDIC with lower computational complexity, a low-power CORDIC-based DCT architecture can be derived, and its detailed implementation is as follows.
Inside the CORDIC module, the look-ahead CORDIC is derived using the parameters, and the scale factors are also specified. An example of the look-ahead CORDIC equation for the 7π/16 rotation and the corresponding scale factors is given by the look-ahead equations. To reduce the number of iterations, the high shift-terms are removed as described in Section III-A. We further reduce the remaining terms considering the data priorities in the DCT coefficients.
In the proposed hardware architecture, all the shift terms for each look-ahead CORDIC equation and the scale factors are precomputed using the look-ahead CORDIC equations. In the corresponding figure, the numbers in the circles represent the shift operations, and the black-colored blocks denote the 2's complement of the shifted element, which is used in the arithmetic operations. The line represents the omitted computations; thus, the two results in the look-ahead CORDIC modules have different numbers of terms, which leads to power savings because of the smaller number of iterations.
C. Experimental Results of the Proposed Low-Power CORDIC-Based DCT Architecture
In this section, the experimental results of the proposed CORDIC-based DCT architecture are given. First, the number of CORDIC iterations is determined according to the target PSNR of 31.5 dB, which is the average PSNR obtained using the nine benchmark images listed in Table IV. The PSNRs of the benchmark images are obtained using the following equations:

$$
\mathrm{PSNR} = 20 \log_{10}\left( \frac{255}{\sqrt{\mathrm{MSE}}} \right)
$$

$$
\mathrm{MSE} = \frac{1}{mn} \sum_{x=0}^{m-1} \sum_{y=0}^{n-1} \left[ I(x, y) - K(x, y) \right]^{2}
$$

where I is the original image of size m × n, and K is the reconstructed image.
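The two quality metrics translate directly into code. A minimal sketch for 8-bit images stored as nested lists; the function names and the handling of identical images (returning infinity) are choices made for this example.

```python
import math

def mse(original, reconstructed):
    """Mean squared error between two images of the same m x n size."""
    m, n = len(original), len(original[0])
    total = sum((original[x][y] - reconstructed[x][y]) ** 2
                for x in range(m) for y in range(n))
    return total / (m * n)

def psnr(original, reconstructed, peak=255.0):
    """PSNR in dB for a given peak pixel value (255 for 8-bit images)."""
    err = mse(original, reconstructed)
    if err == 0:
        return math.inf            # identical images
    return 20.0 * math.log10(peak / math.sqrt(err))

image = [[10, 20], [30, 40]]
noisy = [[11, 19], [31, 39]]       # every pixel is off by one grey level
quality_db = psnr(image, noisy)
```

With every pixel off by one grey level the MSE is 1, so the PSNR equals 20·log10(255), about 48.13 dB.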
For comparison, various DCT architectures, such as the DA-based DCT, the MCM-based DCT, the CORDIC-based DCT, and the CORDIC-based Loeffler DCT, are implemented using a 0.13 µm CMOS standard cell library. The implemented 2-D DCT is marked with a line in Fig. 2, and Table III shows the implementation results. In the table, the power consumption of the various DCT architectures is measured using NanoSim with a 100-MHz clock and a 1.2 V supply voltage; over 500 input vectors are used to obtain the average power. Compared with the DA-based architecture, the proposed DCT shows 38.1% area savings as well as power savings. Compared with the MCM-based DCT, the proposed DCT shows comparable power consumption and a smaller area with a minor image quality degradation of 0.04 dB. Because some of the higher order shift-terms in the CORDIC iterations can be removed considering the importance differences of the DCT coefficients, our proposed DCT architecture shows the lowest gate count and power consumption compared with the other CORDIC-based architectures. In particular, the proposed DCT architecture shows 21.87% power savings compared to the CORDIC-based Loeffler DCT with even higher PSNR results.
IV. RECONFIGURABLE CORDIC-BASED DCT ARCHITECTURE
A. Proposed Reconfigurable Low-Power CORDIC-Based DCT Architecture
Using the low-power DCT architecture presented in the previous section, we propose in this section a reconfigurable CORDIC-based DCT architecture that further reduces the power consumption at the expense of a minor image quality degradation. Several trade-off modes are presented, and the proposed reconfigurable architecture can dynamically change the CORDIC iterations to adaptively trade off computation energy for image quality within the same hardware.
Generally, in the look-ahead CORDIC, the shift-terms for calculating low-frequency DCT coefficients (the terms for calculating X(0), X(1) in (8)) are more important than the shift-terms for calculating high-frequency coefficients. In addition, among the shift-terms in one look-ahead CORDIC equation, the most important terms are the low shift-terms, while the relatively minor terms are the high shift-terms. To save computation power at the expense of minimum image quality degradation, the least important shift-term in X(7) is first removed based on a greedy algorithm. We then search for the next least important shift-term and cancel its computation. As we repeat the process, more shift-terms are removed, which means that the computation power is reduced with minimum image quality degradation.
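The greedy removal can be sketched on a single coefficient. This toy version is a stand-in, not the procedure used in the paper: it measures the error of one look-ahead coefficient instead of the PSNR of reconstructed images, and the term list, target value, and error budget are hypothetical.

```python
def greedy_prune(terms, target, max_error):
    """Repeatedly drop the (sign, shift) term whose removal hurts least.

    Stops when removing any further term would push the absolute error
    with respect to `target` above `max_error`.
    """
    def value(ts):
        return sum(sign * 2.0 ** -shift for sign, shift in ts)

    kept = list(terms)
    while len(kept) > 1:
        candidates = [kept[:j] + kept[j + 1:] for j in range(len(kept))]
        best = min(candidates, key=lambda ts: abs(value(ts) - target))
        if abs(value(best) - target) > max_error:
            break                   # quality constraint would be violated
        kept = best
    return kept

# Hypothetical coefficient 1 + 2^-1 + 2^-3 - 2^-4 = 1.5625
terms = [(1, 0), (1, 1), (1, 3), (-1, 4)]
reduced = greedy_prune(terms, target=1.5625, max_error=0.1)
```

In this example the two highest shift-terms are removed first and the low shift-terms survive, matching the observation that the low shift-terms carry most of the accuracy.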
With the approach shown in Fig. 7, we propose three trade-off levels: normal mode, and modes 1 and 2. As we move to the higher trade-off levels (sacrificing image quality in favor of lower power), the number of shift-terms composing the look-ahead CORDIC equations is reduced. Table IV shows the PSNR results of the benchmark images for the three trade-off levels. The image quality constraints for normal mode, mode 1, and mode 2 are targeted at average PSNRs of 31.5, 30, and 27 dB, respectively, for the nine benchmark images. The number of trade-off modes and the minimum allowable PSNRs can be changed according to the user's choice.
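The number of shift-terms in the look-ahead CORDIC equations and the scaling factors differ among the three modes of operation. As an example, two of the resulting look-ahead expressions are

$$
X_{3\pi/16^{*}} = \left(1 - 2^{-4}\right) x + \left(-2^{-1} - 2^{-3}\right) y
$$

$$
X_{\pi/16^{+}} = \left(1 + 2^{-1}\right) x + \left(2^{-2}\right) y
$$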
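Expressions of this kind need only shifts and additions. The sketch below evaluates both of them on integers, assuming a hypothetical fixed-point format with four fractional bits so that every right shift is exact; scale-factor compensation is left out.

```python
FRAC_BITS = 4   # assumed number of fractional bits in the fixed-point format

def x_3pi16(x, y):
    """(1 - 2^-4) x + (-2^-1 - 2^-3) y, returned in fixed point."""
    xf, yf = x << FRAC_BITS, y << FRAC_BITS
    return (xf - (xf >> 4)) + (-(yf >> 1) - (yf >> 3))

def x_pi16(x, y):
    """(1 + 2^-1) x + (2^-2) y, returned in fixed point."""
    xf, yf = x << FRAC_BITS, y << FRAC_BITS
    return (xf + (xf >> 1)) + (yf >> 2)

def to_float(fixed):
    """Convert a fixed-point result back to a real number."""
    return fixed / (1 << FRAC_BITS)

a = to_float(x_3pi16(16, 8))   # 0.9375 * 16 - 0.625 * 8
b = to_float(x_pi16(16, 8))    # 1.5 * 16 + 0.25 * 8
```

The first expression uses four shift-terms and the second only three, which illustrates how the operation count differs between expressions.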
B. Hardware Implementation of the Reconfigurable DCT and Experimental Results
The image quality and computation energy trade-off approach proposed in the previous section can be realized as reconfigurable hardware using the DCT architecture shown. In normal mode, the low-power DCT architecture of Section III-B is used.
For the other trade-off modes, the proposed DCT architecture can be dynamically reconfigured by simply changing the control signal φ to trade off minor image quality for energy consumption. Once a trade-off mode is decided, the control signal φ controls the turn-off gate arrays for the CORDIC equation terms and the scaling terms.
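The effect of the control signal can be modeled in software as a per-mode enable mask over the shift-terms. Everything in this sketch is hypothetical: the term list, the masks assigned to the three modes, and the four-bit fixed-point format are invented for illustration and do not reproduce the actual gating of the proposed hardware.

```python
FRAC_BITS = 4

# (sign, shift) terms of one look-ahead coefficient; illustrative values only.
TERMS = ((1, 0), (1, 1), (1, 3), (-1, 4))

# Hypothetical enable masks selected by the control signal:
# 1 keeps a term active, 0 gates it off.
MODE_ENABLE = {
    0: (1, 1, 1, 1),   # normal mode: all terms active
    1: (1, 1, 1, 0),   # mode 1: highest shift-term turned off
    2: (1, 1, 0, 0),   # mode 2: two highest shift-terms turned off
}

def gated_multiply(x, mode):
    """Multiply x by the coefficient using only the terms enabled in `mode`."""
    xf = x << FRAC_BITS
    acc = 0
    for (sign, shift), enable in zip(TERMS, MODE_ENABLE[mode]):
        if enable:
            acc += sign * (xf >> shift)
    return acc

def active_terms(mode):
    """Number of shift-and-add operations performed in the given mode."""
    return sum(MODE_ENABLE[mode])

results = {mode: gated_multiply(16, mode) for mode in MODE_ENABLE}
```

Higher modes perform fewer shift-and-add operations on the same datapath, at the price of a coarser coefficient.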
The power consumption is measured using NanoSim with a 100-MHz clock and a 1.2 V supply voltage. The PSNR in Table V is the average PSNR of nine benchmark images. As shown in the table, the proposed architecture offers significant power savings as image quality decreases. Compared with normal mode, mode 2 provides 38.73% power savings at the cost of image quality degradation. Compared with the CORDIC-based Loeffler DCT shown in Table III, the proposed architecture shows 45.3% power savings in mode 1 at the expense of 0.52-dB image quality degradation. At trade-off level 2, the proposed DCT architecture achieves up to 59.5% power savings compared with the conventional CORDIC-based DCT, with appreciable image quality degradation.
V. PERFORMANCE RESULTS
In this section we present a set of performance results. The method reduces the total computation energy. The proposed multiplierless CORDIC-based DCT architecture produces high throughput and is easy to implement in VLSI.
Figs. A and B show how the input values are entered as binary numbers using Force Constant Value.
Fig. A Assigning the input value to determine the angle value
Fig. B Entering the input value in Force Constant Value
The input values in1[3:0] to in8[3:0] appear in the Simulate Behavioral window. The binary values are assigned using Force Constant Value.
From this approach to the CORDIC algorithm, it can be concluded that the proposed algorithm is more efficient than the state-of-the-art CORDIC algorithm. Fig. C shows the output values for the applied input data.
We do not consider the small value variations that appear during the CORDIC iterations, and we assign the intermediate values between the input values and the output values. Coding efficiency can be evaluated using different accuracies (both in the rotation angle approximation of the rotators and in the bit accuracy of the adders in the butterfly operators and rotators) for various DCT-based applications.
The design has a much shorter latency while maintaining high image quality and minimizing hardware complexity, and it simultaneously satisfies low-energy requirements.
Fig. C Output as it appears on the screen
VI. CONCLUSIONS
In this paper, we use the CORDIC algorithm with the DCT to minimize the number of iterations according to the image quality and to reduce the hardware complexity of the system.
Data correlation between neighboring pixels in the image is used to reduce the number of internal iterations. The reconfigurable low-power CORDIC-based DCT method can effectively achieve power savings in this system. With this method, power consumption is reduced with only minor image quality degradation.
REFERENCES
[1] A. Bahari, T. Arslan, and A. T. Erdogan, “Low-power H.264 video compression architectures for mobile communication,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 9, pp. 1251–1261, Sep. 2009.
[2] H. Jeong, J. Kim, and W. Cho, “Low-power multiplierless DCT architecture using image data correlation,” IEEE Trans. Consum. Electron., vol. 50, no. 1, pp. 262–267, Feb. 2004.
[3] Z. Wu, J. Sha, Z. Wang, and L. Li, “An improved scaled DCT architecture,” IEEE Trans. Consum. Electron., vol. 55, no. 2, pp. 685–689, May 2009.
[4] S. Hsiao, Y. Hu, T. Juang, and C. Lee, “Efficient VLSI implementations of fast multiplierless approximated DCT using parameterized hardware modules for silicon intellectual property design,” IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 52, no. 8, pp. 1568–1579, Aug. 2005.
[5] S. Yu and E. E. Swartzlander, “DCT implementation with distributed arithmetic,” IEEE Trans. Comput., vol. 50, no. 9, pp. 985–991, Sep. 2001.
[6] G. Karakonstantis, N. Banerjee, and K. Roy, “Process-variation resilient and voltage-scalable DCT architecture for robust low-power computing,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 10, pp. 1461–1470, Oct. 2010.
[7] J. Park and K. Roy, “A low power reconfigurable DCT architecture to trade off image quality for computational complexity,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., May 2004, pp. 17–20.
[8] J. Park, J. H. Choi, and K. Roy, “Dynamic bit-width adaptation in DCT: An approach to trade off image quality and computation energy,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 5, pp. 787–793, May 2010.
[9] J. Li, “Sign lookahead CORDIC,” M.S. thesis, Dept. Electr. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan, 2008.
[10] T. Liu, T. Lin, S. Wang, and C. Lee, “A low-power dual-mode video decoder for mobile applications,” IEEE Commun. Mag., vol. 44, no. 8, pp. 119–126, Aug. 2006.