SlideShare uma empresa Scribd logo
1 de 23
Baixar para ler offline
B. Baas 442
FFT
• There are many ways to decompose an FFT
[Rabiner and Gold]
• The simplest ones are radix-2
• Computation made up of radix-2 butterflies
X = A + BW
Y = A – BW
A
B
B. Baas 443
FFT Dataflow Diagram
• Dataflow diagram
– N = 64
– radix-2
– 6 stages of
computation
Memory
Locations
0
4
8
12
16
20
24
28
32
36
40
44
48
52
56
60
63
Input Output
B. Baas 444
Radix 2, 8-point FFT
B. Baas 445
Radix 2, 8-point FFT
Radix 2, 16-point FFT
B. Baas 446
Radix 2, 8-point FFT
Radix 2, 32-point FFT
B. Baas 447
Radix 2, 8-point FFT
Radix 2, 64-point FFT
B. Baas 448
Radix 2, 256-point FFT
Radix 2, 256-point FFT
B. Baas 449
Radix 4, 16-point FFT
B. Baas 450
Radix 4, 64-point FFT
B. Baas 451
Radix 4, 256-point FFT
B. Baas 452
Radix 2, Decimation-In-Time (DIT)
• Input order “decimated”—needs bit reversal
• Output in order
• Butterfly:
X = A + BW
Y = A – BW
A
B
x
car-sav add
+
A
B W
Critical path:
B. Baas 453
Radix 2, Decimation In Frequency
(DIF)
• Input in order
• Output “decimated”—needs
bit reversal
• Butterfly:
– Two CPAs
– Wider multiplier
X = A + B
Y = (A – B) W
A
B
x
+
+ (–)
B
Critical path:
A
W
B. Baas 454
Radix 4, DIT Butterfly
• Decimation in Time (DIT) or Decimation in
Frequency (DIF)
W
X
B
C
D
A
Y
V
Bit-Reversed Addressing
• Normally:
– DIT: bit-reverse inputs before processing
– DIF: bit-reverse outputs after processing
• Reverse addressing bits for read/write of data
– 000 (0)  000 (0) # Word 0 does not move location
– 001 (1)  100 (4) # Original word 1 goes to location 4
– 010 (2)  010 (2) # Word 2 does not move location
– 011 (3)  110 (6) # Original word 3 goes to location 6
– 100 (4)  001 (1) # Original word 4 goes to location 1
– 101 (5)  101 (5) # Word 5 does not move location
– 110 (6)  011 (3) # Original word 6 goes to location 3
– 111 (7)  111 (7) # Word 7 does not move location
BIT
REVERSAL
BIT
REVERSAL
B. Baas 455
B. Baas 456
Addressing In Matlab
(Especially helpful for FFTs)
• Matlab
– Matlab can not index arrays with index zero!
• In matlab, do address calculations normally
– AddrA = 0, 2, 4, …
AddrB = 1, 3, 5, …
• then use pointers with an offset of one whenever
indexing arrays
– AddrA = ……;
AddrB = ……;
A = data(AddrA+1);
B = data(AddrB+1);
…
data(AddrA+1) = X;
data(AddrB+1) = Y;
B. Baas 457
Higher Radices
• Radix 2 and radix 4 are certainly the most popular
• Radix 4 is on the order of 20% more efficient than
radix 2 for large transforms
• Radix 8 is sometimes used, but longer radix
butterflies are not common because additional
efficiencies are small and added complexity is non-
trivial (especially for hardware implementations)
B. Baas 458
I. Common-Factor FFTs
• Key characteristics
– Most common class of FFTs
– Also called Cooley-Tukey FFTs
– Factors of N used in decomposition have common factor(s)
• A) Radix-r
– N = rk, where k is a positive integer
– Butterflies used in each stage are the same
– Radix-r butterflies are used
– N/r butterflies per stage
– k = logr N stages
B. Baas 459
I. Common-Factor FFTs
• B) Mixed-radix
– Radices of component butterflies are not all equal
– More complex than radix-r
– Is necessary if N ≠ rk
– Example
• N = 32
• Could calculate with two radix-4 stages and one radix-2 stage
B. Baas 460
II. Prime-Factor FFTs
• The length of transforms must be the product of
relatively prime numbers
• This can be limiting, though it is often possible to
find lengths near popular power-of-2 lengths
(e.g., 7 x 11 x 13 = 1003)
• Their great advantage is that they have no WN
twiddle factor multiplications
• Irregular sorting of input and output data
• Irregular addressing for butterflies
B. Baas 461
III. Other FFTs
• Split-radix FFT
– When N = pk, where p is a small prime number and k is a
positive integer, this method can be more efficient than
standard radix-p FFTs
– “Split-radix Algorithms for Length-pm DFT’s,” Vetterli and
Duhamel, Trans. on Acoustics, Speech, and Signal Processing,
Jan. 1989
X
Y
A
B
V
W
C
D
Wa
Wb
B. Baas 462
III. Other FFTs
• Winograd Fourier Transform Algorithm (WFTA)
– Type of prime factor algorithm based on DFT building
blocks using a highly efficient convolution algorithm
– Requires many additions but only order N multiplications
– Has one of the most complex and irregular structures
• FFTW (www.fftw.org)
– C subroutine libraries highly tuned for specific architectures
• Goertzel DFT
– Not a “normal” FFT in that its computational complexity is
still order N2
– It allows a subset of the DFT’s N output terms to be
efficiently calculated
B. Baas 463
Signal Growth
• Note in DFT equation signal can grow by N times
• This is also seen in the FFT in its growth by r times in
a radix-r butterfly, and logrN stages in the entire
transform: r ^ (logrN) = N
• Thus, the FFT processor requires careful scaling
– Floating point number representation
• Easiest conceptually, but expensive hardware. Typically not
used in efficient DSP systems.
– Fixed-point with scaling by 1/r every stage
• First stage is a special case. Scaling must be done on the inputs
before processing to avoid overflow with large magnitude
complex inputs with certain phases.
– Block floating point
B. Baas 464
Efficient Computation of the IFFT
• out = IFFT(in)
• 0) Design a separate processor for IFFTs
• Re-use a forward FFT engine if available
– 1) Swapping real and imaginary parts:
a = fft(imag(in) + i*real(in));
out = (imag(a) + i*real(a));
– 2) Using conjugates:
a = fft(conj(in));
out = conj(a);
– 3) A simple indexing change:
a = fft(in);
out = [a(0) a(N-1:-1:1)]; % with normal indices
out = [a(1) a(N :-1:2)]; % with weird matlab indices

Mais conteúdo relacionado

Semelhante a Handout_fft_see_this.pdf Fast forrier transform

11 instruction sets addressing modes
11  instruction sets addressing modes 11  instruction sets addressing modes
11 instruction sets addressing modes
Sher Shah Merkhel
 
11 instruction sets addressing modes
11  instruction sets addressing modes 11  instruction sets addressing modes
11 instruction sets addressing modes
Kanika Thakur
 

Semelhante a Handout_fft_see_this.pdf Fast forrier transform (20)

Graph partition 2
Graph partition 2Graph partition 2
Graph partition 2
 
Lec6 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Instruction...
Lec6 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Instruction...Lec6 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Instruction...
Lec6 Computer Architecture by Hsien-Hsin Sean Lee Georgia Tech -- Instruction...
 
Lec3 instructions branch carl hamcher
Lec3 instructions branch carl hamcher Lec3 instructions branch carl hamcher
Lec3 instructions branch carl hamcher
 
IDEA.ppt
IDEA.pptIDEA.ppt
IDEA.ppt
 
Introduction to 8086 microprocessor
Introduction to 8086 microprocessorIntroduction to 8086 microprocessor
Introduction to 8086 microprocessor
 
Integrated Register Allocation introduction
Integrated Register Allocation introductionIntegrated Register Allocation introduction
Integrated Register Allocation introduction
 
8086 Instruction set
8086 Instruction set8086 Instruction set
8086 Instruction set
 
File 10 - CSX 334 _VRA NBO.ppsx
File 10 - CSX 334 _VRA NBO.ppsxFile 10 - CSX 334 _VRA NBO.ppsx
File 10 - CSX 334 _VRA NBO.ppsx
 
Classless inter domain routing
Classless inter domain routingClassless inter domain routing
Classless inter domain routing
 
FPGA_Logic.pdf
FPGA_Logic.pdfFPGA_Logic.pdf
FPGA_Logic.pdf
 
Single instruction multiple data
Single instruction multiple dataSingle instruction multiple data
Single instruction multiple data
 
Fuzzy arithmetic
Fuzzy arithmeticFuzzy arithmetic
Fuzzy arithmetic
 
Analog to Digital Converters and Data Acquisition Systems
Analog to Digital Converters and Data Acquisition SystemsAnalog to Digital Converters and Data Acquisition Systems
Analog to Digital Converters and Data Acquisition Systems
 
Logsv2
Logsv2Logsv2
Logsv2
 
Fixed Point Conversion
Fixed Point ConversionFixed Point Conversion
Fixed Point Conversion
 
11 instruction sets addressing modes
11  instruction sets addressing modes 11  instruction sets addressing modes
11 instruction sets addressing modes
 
11 instruction sets addressing modes
11  instruction sets addressing modes 11  instruction sets addressing modes
11 instruction sets addressing modes
 
Optimizing Data Partitioning at Broadcasting the Data
Optimizing Data Partitioning at Broadcasting the DataOptimizing Data Partitioning at Broadcasting the Data
Optimizing Data Partitioning at Broadcasting the Data
 
Intermediate code generation in Compiler Design
Intermediate code generation in Compiler DesignIntermediate code generation in Compiler Design
Intermediate code generation in Compiler Design
 
Amba axi 29 3_2015
Amba axi 29 3_2015Amba axi 29 3_2015
Amba axi 29 3_2015
 

Último

Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
drjose256
 

Último (20)

Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...
Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...
Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...
 
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...
 
Diploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdfDiploma Engineering Drawing Qp-2024 Ece .pdf
Diploma Engineering Drawing Qp-2024 Ece .pdf
 
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
 
Artificial Intelligence Bayesian Reasoning
Artificial Intelligence Bayesian ReasoningArtificial Intelligence Bayesian Reasoning
Artificial Intelligence Bayesian Reasoning
 
Raashid final report on Embedded Systems
Raashid final report on Embedded SystemsRaashid final report on Embedded Systems
Raashid final report on Embedded Systems
 
How to Design and spec harmonic filter.pdf
How to Design and spec harmonic filter.pdfHow to Design and spec harmonic filter.pdf
How to Design and spec harmonic filter.pdf
 
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
 
Lab Manual Arduino UNO Microcontrollar.docx
Lab Manual Arduino UNO Microcontrollar.docxLab Manual Arduino UNO Microcontrollar.docx
Lab Manual Arduino UNO Microcontrollar.docx
 
Introduction to Arduino Programming: Features of Arduino
Introduction to Arduino Programming: Features of ArduinoIntroduction to Arduino Programming: Features of Arduino
Introduction to Arduino Programming: Features of Arduino
 
Piping and instrumentation diagram p.pdf
Piping and instrumentation diagram p.pdfPiping and instrumentation diagram p.pdf
Piping and instrumentation diagram p.pdf
 
analog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxanalog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptx
 
Supermarket billing system project report..pdf
Supermarket billing system project report..pdfSupermarket billing system project report..pdf
Supermarket billing system project report..pdf
 
Research Methodolgy & Intellectual Property Rights Series 2
Research Methodolgy & Intellectual Property Rights Series 2Research Methodolgy & Intellectual Property Rights Series 2
Research Methodolgy & Intellectual Property Rights Series 2
 
Introduction to Heat Exchangers: Principle, Types and Applications
Introduction to Heat Exchangers: Principle, Types and ApplicationsIntroduction to Heat Exchangers: Principle, Types and Applications
Introduction to Heat Exchangers: Principle, Types and Applications
 
Instruct Nirmaana 24-Smart and Lean Construction Through Technology.pdf
Instruct Nirmaana 24-Smart and Lean Construction Through Technology.pdfInstruct Nirmaana 24-Smart and Lean Construction Through Technology.pdf
Instruct Nirmaana 24-Smart and Lean Construction Through Technology.pdf
 
Intelligent Agents, A discovery on How A Rational Agent Acts
Intelligent Agents, A discovery on How A Rational Agent ActsIntelligent Agents, A discovery on How A Rational Agent Acts
Intelligent Agents, A discovery on How A Rational Agent Acts
 
Quiz application system project report..pdf
Quiz application system project report..pdfQuiz application system project report..pdf
Quiz application system project report..pdf
 
Filters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility ApplicationsFilters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility Applications
 
Research Methodolgy & Intellectual Property Rights Series 1
Research Methodolgy & Intellectual Property Rights Series 1Research Methodolgy & Intellectual Property Rights Series 1
Research Methodolgy & Intellectual Property Rights Series 1
 

Handout_fft_see_this.pdf Fast forrier transform

  • 1. B. Baas 442 FFT • There are many ways to decompose an FFT [Rabiner and Gold] • The simplest ones are radix-2 • Computation made up of radix-2 butterflies X = A + BW Y = A – BW A B
  • 2. B. Baas 443 FFT Dataflow Diagram • Dataflow diagram – N = 64 – radix-2 – 6 stages of computation Memory Locations 0 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60 63 Input Output
  • 3. B. Baas 444 Radix 2, 8-point FFT
  • 4. B. Baas 445 Radix 2, 8-point FFT Radix 2, 16-point FFT
  • 5. B. Baas 446 Radix 2, 8-point FFT Radix 2, 32-point FFT
  • 6. B. Baas 447 Radix 2, 8-point FFT Radix 2, 64-point FFT
  • 7. B. Baas 448 Radix 2, 256-point FFT Radix 2, 256-point FFT
  • 8. B. Baas 449 Radix 4, 16-point FFT
  • 9. B. Baas 450 Radix 4, 64-point FFT
  • 10. B. Baas 451 Radix 4, 256-point FFT
  • 11. B. Baas 452 Radix 2, Decimation-In-Time (DIT) • Input order “decimated”—needs bit reversal • Output in order • Butterfly: X = A + BW Y = A – BW A B x car-sav add + A B W Critical path:
  • 12. B. Baas 453 Radix 2, Decimation In Frequency (DIF) • Input in order • Output “decimated”—needs bit reversal • Butterfly: – Two CPAs – Wider multiplier X = A + B Y = (A – B) W A B x + + (–) B Critical path: A W
  • 13. B. Baas 454 Radix 4, DIT Butterfly • Decimation in Time (DIT) or Decimation in Frequency (DIF) W X B C D A Y V
  • 14. Bit-Reversed Addressing • Normally: – DIT: bit-reverse inputs before processing – DIF: bit-reverse outputs after processing • Reverse addressing bits for read/write of data – 000 (0)  000 (0) # Word 0 does not move location – 001 (1)  100 (4) # Original word 1 goes to location 4 – 010 (2)  010 (2) # Word 2 does not move location – 011 (3)  110 (6) # Original word 3 goes to location 6 – 100 (4)  001 (1) # Original word 4 goes to location 1 – 101 (5)  101 (5) # Word 5 does not move location – 110 (6)  011 (3) # Original word 6 goes to location 3 – 111 (7)  111 (7) # Word 7 does not move location BIT REVERSAL BIT REVERSAL B. Baas 455
  • 15. B. Baas 456 Addressing In Matlab (Especially helpful for FFTs) • Matlab – Matlab can not index arrays with index zero! • In matlab, do address calculations normally – AddrA = 0, 2, 4, … AddrB = 1, 3, 5, … • then use pointers with an offset of one whenever indexing arrays – AddrA = ……; AddrB = ……; A = data(AddrA+1); B = data(AddrB+1); … data(AddrA+1) = X; data(AddrB+1) = Y;
  • 16. B. Baas 457 Higher Radices • Radix 2 and radix 4 are certainly the most popular • Radix 4 is on the order of 20% more efficient than radix 2 for large transforms • Radix 8 is sometimes used, but longer radix butterflies are not common because additional efficiencies are small and added complexity is non- trivial (especially for hardware implementations)
  • 17. B. Baas 458 I. Common-Factor FFTs • Key characteristics – Most common class of FFTs – Also called Cooley-Tukey FFTs – Factors of N used in decomposition have common factor(s) • A) Radix-r – N = rk, where k is a positive integer – Butterflies used in each stage are the same – Radix-r butterflies are used – N/r butterflies per stage – k = logr N stages
  • 18. B. Baas 459 I. Common-Factor FFTs • B) Mixed-radix – Radices of component butterflies are not all equal – More complex than radix-r – Is necessary if N ≠ rk – Example • N = 32 • Could calculate with two radix-4 stages and one radix-2 stage
  • 19. B. Baas 460 II. Prime-Factor FFTs • The length of transforms must be the product of relatively prime numbers • This can be limiting, though it is often possible to find lengths near popular power-of-2 lengths (e.g., 7 x 11 x 13 = 1003) • Their great advantage is that they have no WN twiddle factor multiplications • Irregular sorting of input and output data • Irregular addressing for butterflies
  • 20. B. Baas 461 III. Other FFTs • Split-radix FFT – When N = pk, where p is a small prime number and k is a positive integer, this method can be more efficient than standard radix-p FFTs – “Split-radix Algorithms for Length-pm DFT’s,” Vetterli and Duhamel, Trans. on Acoustics, Speech, and Signal Processing, Jan. 1989 X Y A B V W C D Wa Wb
  • 21. B. Baas 462 III. Other FFTs • Winograd Fourier Transform Algorithm (WFTA) – Type of prime factor algorithm based on DFT building blocks using a highly efficient convolution algorithm – Requires many additions but only order N multiplications – Has one of the most complex and irregular structures • FFTW (www.fftw.org) – C subroutine libraries highly tuned for specific architectures • Goertzel DFT – Not a “normal” FFT in that its computational complexity is still order N2 – It allows a subset of the DFT’s N output terms to be efficiently calculated
  • 22. B. Baas 463 Signal Growth • Note in DFT equation signal can grow by N times • This is also seen in the FFT in its growth by r times in a radix-r butterfly, and logrN stages in the entire transform: r ^ (logrN) = N • Thus, the FFT processor requires careful scaling – Floating point number representation • Easiest conceptually, but expensive hardware. Typically not used in efficient DSP systems. – Fixed-point with scaling by 1/r every stage • First stage is a special case. Scaling must be done on the inputs before processing to avoid overflow with large magnitude complex inputs with certain phases. – Block floating point
  • 23. B. Baas 464 Efficient Computation of the IFFT • out = IFFT(in) • 0) Design a separate processor for IFFTs • Re-use a forward FFT engine if available – 1) Swapping real and imaginary parts: a = fft(imag(in) + i*real(in)); out = (imag(a) + i*real(a)); – 2) Using conjugates: a = fft(conj(in)); out = conj(a); – 3) A simple indexing change: a = fft(in); out = [a(0) a(N-1:-1:1)]; % with normal indices out = [a(1) a(N :-1:2)]; % with weird matlab indices