MDCT AUDIO CODING WITH PULSE VECTOR QUANTIZERS
Jonas Svedberg, Volodya Grancharov, Sigurdur Sverrisson, Erik Norvell, Tomas Toftgård, Harald
Pobloth, and Stefan Bruhn
SMN, Ericsson Research, Ericsson AB
164 80, Stockholm, Sweden
ABSTRACT
This paper describes a novel audio coding algorithm that is a
building block in the recently standardized 3GPP EVS codec [1].
The presented scheme operates in the Modified Discrete Cosine
Transform (MDCT) domain and deploys a Split-PVQ pulse coding
quantizer, a noise-fill, and a gain control optimized for the
quantizer’s properties. A complexity analysis in terms of WMOPS
is presented to illustrate that the proposed Split-PVQ concept and
dynamic range optimized MPVQ-indexing are suitable for real-
time audio coding. Test results from formal MOS subjective
evaluations and objective performance figures are presented to
illustrate the competitiveness of the proposed algorithm.
Index Terms— Audio Coding, MDCT, PVQ, Noise-Fill, EVS
1. INTRODUCTION
In the art of high quality speech and audio coding there is a need to
efficiently quantize the residual signals after an initial dynamic
compression (or prediction) step. At high rates a scalar quantizer
(SQ) followed by entropy coding has been in widespread use [2],
while at medium rates trained codebooks [3] or lattice vector
quantizers (LVQs) [4] have been in use. The main disadvantage
with trained codebooks is that they do not scale well to higher rates
in terms of storage and search complexity. Scalar quantization with
entropy coding has the drawback that the entropy statistics need to
be stored and also trained on the correct material, and the output is
by nature variable rate (requiring a rate control loop). For low rate
speech coding, fixed rate sparse, multi-pulse and phase structured
pulse vector quantizers have been in successful use since the early
1990s [5]. At low rates they have provided a good trade-off between
performance and implementation complexity in terms of both
storage and cycles. However, without an extendable structure, they
have been difficult to apply to higher bit rates. An alternative to
pulse vector quantizers at higher rates are lattice quantizers, such
as D8 [6] and rotated E8 [4]. Due to their regular structure they
have very good search properties. However, the handling of lattice
outliers and the task of indexing a point in an extended lattice and
reconstructing the point from the received index may be
cumbersome. Furthermore, lattice quantizers require entropy coding to
provide good performance [7], and they have a dimensional
rigidity, using fixed vector lengths such as 8 or 24 [8].
The Pyramid Vector Quantizer (PVQ) is a structured pulse
vector quantizer introduced in the 1980s by Fischer [9] [10] (for
Laplacian sources). The pyramid structure enables an efficient
search, and [11], [12] and [13] have shown that limiting the
indexing problem to smaller, controlled-size sub-sections
may give a good trade-off between performance and complexity.
When employed in a transform domain codec, the energy
focusing property of a pulse based quantizer may at lower rates
result in uncovering perceptual artifacts, such as local energy
losses, which need to be addressed in a post-processing step. In
this paper we first give an overview of the MDCT based codec,
followed by a presentation of novel modules used to quantize and
post-process the band normalized transform coefficients using a
PVQ. Finally, to demonstrate the performance of the introduced
Split-PVQ concept and the pulse coding optimized noise-fill,
subjective and objective measurements are presented.
2. CODEC OVERVIEW
The proposed MDCT audio codec operates along the lines of
the ITU-T G.719 framework [6]. As a coding mode in the EVS
codec, it handles sample rates of 16/32/48 kHz (WB/SWB/FB) and
is used at bit rates of 24.4, 32 and 64 kbps. An overview of the
system is depicted in Figure 1. Similar to the G.719 scheme, it
operates on overlapping input audio frames $x(t)$, but here a
reduced algorithmic delay is achieved by using an alternative
asymmetric window [14] which has a smaller overlap compared to
the standard sinusoid window. The windowed, time-aliased input
$\tilde{x}(t)$ is transformed to MDCT [15] coefficients $y(k)$ using a DCT-IV:
$$y(k) = \sum_{t=0}^{L-1} \tilde{x}(t)\,\cos\!\left[\frac{\pi}{L}\left(t+\tfrac{1}{2}\right)\left(k+\tfrac{1}{2}\right)\right], \quad k = 0, \ldots, L-1. \qquad (1)$$
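As an illustration of Eq. (1), the following Python sketch evaluates the DCT-IV sum directly. It is not the codec's implementation: the windowing and time-aliasing that produce the input are omitted, the function name `dct_iv` is hypothetical, and a real-time codec would use a fast transform rather than this O(L^2) loop.

```python
import math

def dct_iv(x_tilde):
    """Direct evaluation of Eq. (1) for a windowed, time-aliased frame of length L."""
    L = len(x_tilde)
    return [
        sum(x_tilde[t] * math.cos(math.pi / L * (t + 0.5) * (k + 0.5)) for t in range(L))
        for k in range(L)
    ]

# Toy frame (hypothetical values). The unnormalized DCT-IV is its own
# inverse up to a factor 2/L, which gives a simple round-trip check.
x = [1.0, -2.0, 0.5, 3.0]
y = dct_iv(x)
x_back = [2.0 / len(x) * v for v in dct_iv(y)]
print(all(abs(a - b) < 1e-9 for a, b in zip(x, x_back)))
```

The round trip relies only on the orthogonality of the DCT-IV basis, so it holds for any frame length.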
[Figure 1 block diagram. Encoder blocks: adaptive windowing and transform, norm calculation and coding, bit allocation, spectrum normalization, PVQ gain & shape quantization. Decoder blocks: norm decoding, bit allocation, PVQ gain & shape decoding, noise-fill, spectrum scaling, inverse transform and overlap-add.]
Figure 1: Overview of the MDCT based codec system
The MDCT spectrum is partitioned into bands, where each band $b$
is associated with a norm factor $E(b)$, a quantized norm $\hat{E}(b)$ and
a norm index $I(b)$. The set of $I(b)$ is input to a bit distribution
algorithm to produce a bit allocation $B(b)$. Each band is
normalized using the quantized norm, $\mathbf{y}_n(b) = \mathbf{y}(b)/\hat{E}(b)$, with
$\mathbf{y}(b) = [y(s_b), y(s_b+1), \ldots, y(e_b)]^T$, where $s_b$ and $e_b$ denote the
start and end indices of band $b$. The band structure for 48 kHz is
the same as in [6], and a complete list of band structures can be
found in [14]. The normalized bands are then encoded using the
PVQ described in section 3. Additional fine gain adjustments are
made for each band as outlined in section 4. The decoder
reconstructs the encoded spectrum from the encoded parameters by
reversing the steps of the encoder. The bands with zero allocated
bits, $B(b) = 0$, are filled using methods described in section 5.
3. QUANTIZER
The normalized MDCT coefficients in each band are quantized
using a Split-PVQ concept. Bands where the number of allocated bits
(excluding bits for fine gain coding), $R(b)$, is less than a
threshold are directly quantized into an energy normalized PVQ
vector $\hat{\mathbf{y}}_n$. Bands with a higher number of allocated bits are first
split into smaller target segments using an adaptive split
methodology and then each smaller segment is coded. In both
cases the resulting codewords are subsequently encoded by a range
encoder [16] supporting both uniform and tailor-made probability
density functions (PDFs).
3.1. Split-PVQ Methodology
Pulse position coding often results in large codeword indices,
especially for long input vectors, due to the rapidly increasing
number of combinations for large dimensions. For low complexity
implementations a process of distributing the available bits
between shorter sub segments can be used. Earlier examples of
binary recursive splitting approaches can be found in [11] and in
the indexing of [12]. An issue with those approaches is that the
segmentation may result in a split vector with very different
segment sizes. This may result in very inefficient positional coding.
To demonstrate the inefficiency of non-uniform segmentation,
consider a split of a 16-dimensional vector in two ways: A)
symmetric (8+8) and B) asymmetric (2+14). Assume there are 2
pulses to code in each segment. Not considering overlap and
sign, the number of codewords in each segment is $N!/(K!\,(N-K)!)$,
where $N$ is the segment dimension and $K$ is the number of unit
pulses. This results in A) 28+28 = 56 combinations, and B) 1+91 =
92 combinations, which motivates the use of more symmetric
segmentations. In the presented scheme, an algorithm for band
splitting is activated when the bit budget for a particular
band, $R(b)$, exceeds a pre-determined threshold. The input vector
$\mathbf{y}_n(b)$ is split into uniform (or close to uniform) segments $\mathbf{y}_{seg}$ in a
non-recursive way. The number of segments is calculated as:
$$N_p = \left\lceil \frac{R(b)}{R_{segmax} + R_{splitcost}} \right\rceil, \qquad (2)$$
where $R_{segmax} = 32$ is the maximum allowed number of bits for a
segment and $R_{splitcost}$ is an experimentally determined split
overhead cost. If the number of allocated bits for each segment is
close to $R_{segmax}$ and the variance of the segment energies is high,
the number of splits is increased by one, allowing for higher
flexibility in the segment bit allocation.
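The counting argument and the segment count of Eq. (2) can be checked with a few lines of Python. This is a sketch under assumptions: the value of `R_SPLITCOST` is hypothetical (the text only states that it is determined experimentally), Eq. (2) is read with a ceiling so that the result is an integer, and the way leftover coefficients are assigned in `segment_lengths` is an illustrative choice, not taken from the paper.

```python
import math

def position_codewords(n, k):
    """Ways to place k unit pulses in n positions (no overlap, no sign)."""
    return math.comb(n, k)

# Split A (8+8) versus split B (2+14), two pulses per segment.
symmetric = position_codewords(8, 2) + position_codewords(8, 2)
asymmetric = position_codewords(2, 2) + position_codewords(14, 2)

R_SEGMAX = 32      # maximum bits per segment, from the text
R_SPLITCOST = 2    # hypothetical value; the paper only says it is tuned experimentally

def num_segments(r_band):
    """Eq. (2), read here with a ceiling so that N_p is an integer."""
    return math.ceil(r_band / (R_SEGMAX + R_SPLITCOST))

def segment_lengths(band_len, n_p):
    """Near-uniform split; giving the extra coefficients to the first segments is an assumption."""
    base, extra = divmod(band_len, n_p)
    return [base + (1 if i < extra else 0) for i in range(n_p)]

print(symmetric, asymmetric)     # 56 92
print(num_segments(80))          # 3
print(segment_lengths(32, 3))    # [11, 11, 10]
```

The variance-based increase of the number of splits mentioned above is not modeled here.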
Each segment's shape is individually quantized and energy
ratios describe the relation between the segment energies. A
difficulty in this approach is that energy
variations between different segments might be large, which is
unfavorable for the coding of relative energies. This is managed by
calculating angles representing energy ratios between a left and a
right group of segments, with as equal a number of segments as
possible, using a recursive top-down approach. At level $l$ of the
recursion the angle is determined as
$$\alpha^l = \arctan\!\left(\sqrt{E_R^l / E_L^l}\right),$$
where $E_L^l$ and $E_R^l$ are the energies of the left and right groups. In
case of even $N_p$, the first level angle will represent the energy
ratio between a left and right group with an equal number of
segments. At each step of the recursion additional angles within
each group are determined until the left and the right groups
consist of one segment each. In case of an odd number of splits,
the angle calculated in the first iteration will have a larger number
of segments in the right group than in the left.
The angles are additionally used to recursively distribute bits
to the already determined segments. At each level of recursion the
bits $R_L^l$ and $R_R^l$ allocated to the left and the right groups are
determined from the available number of bits $R^l$ (after angle
coding), the lengths of the groups $L_L^l$ and $L_R^l$ (number of
coefficients), and the angle $\alpha^l$ by:
$$R_L^l = \frac{R^l - L_R^l \cdot \log_2\!\left(\tan(\alpha^l)\right)}{1 + L_R^l / L_L^l}, \qquad R_R^l = R^l - R_L^l. \qquad (3)$$
For example, if the allocated bits for the PVQ, $R(b)$, require a split into
three segments, the angle calculation and bit distribution are done as
follows:
Step 0: The angle $\alpha^0$ is used to distribute the bits $R^0$, i.e.
$R(b)$ minus the bits used to code $\alpha^0$, to the left and the right
groups of segments ($R_L^0$ and $R_R^0$). The left group includes just one
segment while the right group includes two segments.
Step 1: The right group from Step 0 is split into two groups,
where the angle $\alpha_R^1$ is used to distribute the remaining bits of $R_R^0$
(after coding of $\alpha_R^1$) to $R_{RL}^1$ (the second segment) and $R_{RR}^1$ (the third
segment). The bits $R_L^0$ are directly used to code the first segment.
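The three-segment example can be traced with a small recursive sketch. Several simplifications are assumed and should not be read as the codec's behavior: the angle is taken as the arctangent of the right/left amplitude ratio (square root of the energy ratio) and is left unquantized, the bits spent on coding each angle are ignored, all energies are strictly positive, and no clamping or integer rounding of the resulting bits is applied. All function names are hypothetical.

```python
import math

def split_angle(e_left, e_right):
    """Angle whose tangent is the right/left amplitude ratio (assumed reading)."""
    return math.atan2(math.sqrt(e_right), math.sqrt(e_left))

def split_bits(r, len_left, len_right, alpha):
    """Eq. (3): bits for the left and right groups."""
    r_left = (r - len_right * math.log2(math.tan(alpha))) / (1.0 + len_right / len_left)
    return r_left, r - r_left

def distribute(energies, lengths, r):
    """Top-down recursion over groups of segments; returns bits per segment."""
    n = len(energies)
    if n == 1:
        return [r]
    n_left = n // 2  # odd count: the right group gets the extra segment
    alpha = split_angle(sum(energies[:n_left]), sum(energies[n_left:]))
    r_left, r_right = split_bits(r, sum(lengths[:n_left]), sum(lengths[n_left:]), alpha)
    return (distribute(energies[:n_left], lengths[:n_left], r_left)
            + distribute(energies[n_left:], lengths[n_left:], r_right))

# Three segments of 8 coefficients each; the first one carries less energy.
bits = distribute([1.0, 4.0, 4.0], [8, 8, 8], 90.0)
print([round(b, 3) for b in bits])   # approximately [22.0, 34.0, 34.0]
```

In this toy case Step 0 separates the first segment from the other two, and Step 1 divides the right group's bits evenly because its two segments have equal energy and length.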
3.2. Normalization and Shape Search
For quantization of each split segment $\mathbf{y}_{seg}$ we employ a unit
hyper-sphere normalization approach similar to the concept in [8],
however now applied in the PVQ context (as in [12]). The
quantized split segment $\hat{\mathbf{y}}_{seg}(g_{seg}, N, K)$ is defined as:
$$\hat{\mathbf{y}}_{seg}(g_{seg}, N, K) = g_{seg}\,\frac{\mathbf{z}_{N,K}}{\sqrt{\mathbf{z}_{N,K}^T \mathbf{z}_{N,K}}}, \qquad (4)$$
$$\mathbf{z}_{N,K} = \left\{ \mathbf{e} : \sum_{i=0}^{N-1} |e_i| = K \right\}, \quad N = 1, \ldots, 64, \quad K = 1, \ldots, 512, \qquad (5)$$
where $\mathbf{z}_{N,K}$ is an integer point on the surface of an $N$-dimensional
hyper-pyramid such that the L1 norm of $\mathbf{z}_{N,K}$ is $K$, and $g_{seg}$ is
the segment gain ($\cos(\alpha)$ or $\sin(\alpha)$ at the lowest split level). The
goal of the PVQ search is to minimize the shape error between the
split segments $\mathbf{y}_{seg}$ and $\hat{\mathbf{y}}_{seg}$. The search is performed by first
using a projection [9] to a lower sub-pyramid, followed by an
iterative approach (as described in [12]) of evaluating the addition
of a single unit pulse to each dimension until the target pyramid is
reached. Further, the search keeps track of the previous pyramid's
maximum accumulated amplitude, to speed up the innermost loop
in a fixed point implementation.
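A floating-point sketch of such a search is given below. It follows the general pattern described above (projection to a lower pyramid, then greedy addition of unit pulses), but the details are assumptions: the projection targets K-1 pulses with rounding down, each added pulse maximizes the squared correlation divided by the shape energy, and the fixed-point amplitude tracking is left out. The names `pvq_search` and `pvq_reconstruct` are hypothetical.

```python
import math

def pvq_search(target, k):
    """Greedy PVQ shape search returning an integer vector with L1 norm k (k >= 1)."""
    n = len(target)
    absx = [abs(v) for v in target]
    l1 = sum(absx)
    if l1 == 0.0:
        return [k] + [0] * (n - 1)      # degenerate target: put all pulses in one position
    # Projection onto a lower pyramid; rounding down never overshoots k - 1 pulses.
    z = [int((k - 1) * a / l1) for a in absx]
    corr = sum(a * p for a, p in zip(absx, z))
    energy = sum(p * p for p in z)
    for _ in range(k - sum(z)):
        best_i, best_num, best_den = 0, -1.0, 1.0
        for i in range(n):
            c = corr + absx[i]
            e = energy + 2 * z[i] + 1
            # Maximize c^2 / e; cross-multiplication avoids a division per candidate.
            if c * c * best_den > best_num * e:
                best_i, best_num, best_den = i, c * c, e
        corr += absx[best_i]
        energy += 2 * z[best_i] + 1
        z[best_i] += 1
    return [p if v >= 0 else -p for p, v in zip(z, target)]

def pvq_reconstruct(z, g_seg=1.0):
    """Eq. (4): scale the integer vector to the unit hyper-sphere times the segment gain."""
    norm = math.sqrt(sum(p * p for p in z))
    return [g_seg * p / norm for p in z]

z = pvq_search([0.9, -0.1, 0.3, 0.0], 4)
print(z, sum(abs(p) for p in z))
print(pvq_reconstruct(z))
```

Working on absolute values and restoring the signs at the end keeps the inner loop free of sign handling.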
3.3. Split-PVQ Codeword Encoding
For each split angle $\alpha$, angle bits are allocated based on the total
number of bits allocated for the two groups of segments and the
length $L_L + L_R$. The angles are uniformly quantized and then
encoded by the range encoder. Efficient coding is achieved by
modelling the angle statistics using an asymmetric triangular PDF,
where the asymmetry is given by the ratio $L_L / L_R$.
The $\mathbf{z}_{N,K}$ vector is then indexed using a method that has
similarities to both the product code enumeration in [11] and the
optimized Fischer recursion in [12]. Two shape codewords are
composed as follows: a first codeword representing the first sign
encountered in the vector, independent of its position, and a second
codeword representing (in a recursive fashion) all the remaining
pulses in the remaining vector, which is now guaranteed to have a
leading positive pulse. The second codeword is enumerated using
the structure displayed in Table 1. The recursive structure defines
the $U(N,K)$ offset matrix and enables the recursion computations
to stay within the $B-1$ bit dynamics of a $B$-bit signed integer.
Lead value | Section size | Section definition
$K$ | 1 | The all pulses consumed case; zeroes in remaining dimensions.
$K-1, \ldots, 1$ | $2 \cdot U(N,K)$ | All initial pulse amplitude cases with a subsequent new leading sign (positive or negative).
0 | $N_{MPVQ}(N-1,K)$ | The no initial pulse consumed cases; the current leading sign is kept for the next dimension.
Table 1: Modular-PVQ (MPVQ) indexing structure
From Table 1 we find that the total number of entries (with the
very first leading sign information removed) can be expressed as:
$$N_{MPVQ}(N,K) = 1 + 2 \cdot U(N,K) + N_{MPVQ}(N-1,K). \qquad (6)$$
Combining (6) with Fischer's PVQ recursion in [9], we find that
the total number of entries can be expressed as:
$$N_{MPVQ}(N,K) = 1 + U(N,K) + U(N,K+1). \qquad (7)$$
Runtime computed values of the $U(N,K)$ matrix may now be used
as the basis for the MPVQ indexing, and the update of the
symmetric $U$ matrix from row $N-1$ to row $N$ can be performed as:
$$U(N,K+1) = 1 + U(N-1,K) + U(N-1,K+1) + U(N,K), \qquad (8)$$
with initial conditions $U(N,0) = U(N,1) = U(0,K) = U(1,K) = 0$.
The two short codewords of size 1 and size $B-1$ are finally sent to
the range encoder using a uniform PDF.
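The recursion of Eq. (8) and the size expression of Eq. (7) can be cross-checked numerically against the classic PVQ codeword count. The sketch below is only a consistency check on small dimensions, not the indexing routine itself; the helper names are hypothetical, and `pvq_size` implements Fischer's well-known counting recursion as an independent reference.

```python
from functools import lru_cache

def mpvq_offsets(n_max, k_max):
    """U[n][k] for n <= n_max and k <= k_max + 1, built row by row with Eq. (8)."""
    U = [[0] * (k_max + 2) for _ in range(n_max + 1)]  # initial conditions are all zero
    for n in range(2, n_max + 1):
        for k in range(1, k_max + 1):
            U[n][k + 1] = 1 + U[n - 1][k] + U[n - 1][k + 1] + U[n][k]
    return U

def mpvq_size(U, n, k):
    """Eq. (7): number of MPVQ entries with the leading sign removed."""
    return 1 + U[n][k] + U[n][k + 1]

@lru_cache(maxsize=None)
def pvq_size(n, k):
    """Fischer's PVQ count, used only as an independent check."""
    if k == 0:
        return 1
    if n == 0:
        return 0
    return pvq_size(n - 1, k) + pvq_size(n - 1, k - 1) + pvq_size(n, k - 1)

U = mpvq_offsets(8, 8)
# Removing the leading sign should halve the number of codewords.
consistent = all(
    2 * mpvq_size(U, n, k) == pvq_size(n, k)
    for n in range(1, 9) for k in range(1, 9)
)
print(consistent)
print(mpvq_size(U, 8, 8), pvq_size(8, 8))
```

Because the MPVQ index space is half the PVQ space, an index range that would need B bits as a full PVQ index fits in B-1 bits once the leading sign is sent separately.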
4. GAIN CONTROL
The Split-PVQ is a pure shape quantizer and a fine gain is needed
to scale the quantized shape to the correct level. To determine the
fine gain, the allocated bits $B(b)$ are split into bits for the PVQ
quantizer, $R(b)$, and bits for a fine gain quantizer, $R_{FG}(b)$, according
to:
$$\begin{cases} R_{FG}(b) = \mathrm{bits}_{FG}(R_{samp}(b)) \\ R_{samp}(b) = \max\left(\lfloor B(b)/L_b \rfloor,\ 7\right) \\ R(b) = B(b) - R_{FG}(b), \end{cases} \qquad (9)$$
where $\mathrm{bits}_{FG}(R_{samp})$ is a lookup table for fine gain bits. After
quantizing the shape vector, a fine gain prediction is made using an
accuracy measure $a(b)$ based on the bandwidth $L_b$, the total
number of pulses used in the Split-PVQ encoding, $N_{pulses}(b)$, and
the highest identified integer pulse amplitude $P_{max}(b)$ in the set of
encoded shape vectors $\mathbf{z}_{N,K}$ forming $\hat{\mathbf{y}}_n(b)$, according to:
$$\begin{cases} g_{pred}(b) = 1 - 0.05 / a(b) \\ a(b) = N_{pulses}(b) / L_b \cdot P_{max}(b). \end{cases} \qquad (10)$$
For bands where $R_{FG}(b) > 0$, a fine gain prediction error is
computed as $g_{err}(b) = g_{fg}(b) / g_{pred}(b)$, where the fine gain is
$g_{fg}(b) = \mathbf{y}_n(b)^T \hat{\mathbf{y}}_n(b) \cdot (\hat{\mathbf{y}}_n(b)^T \hat{\mathbf{y}}_n(b))^{-1}$, and $\hat{\mathbf{y}}_n(b)$ has been
normalized to unit root mean square (RMS). The gain prediction
error $g_{err}(b)$ is encoded with a non-uniform scalar quantizer in the
logarithmic domain using $R_{FG}(b)$ bits. The decoder reconstructs
the fine gain adjustment through:
$$\hat{g}_{fg}(b) = \hat{g}_{err}(b) \cdot g_{pred}(b), \qquad (11)$$
with $\hat{g}_{err}(b) = 1$ when $R_{FG}(b) = 0$. The final scaling of the
reconstructed band is obtained as:
$$\hat{\mathbf{y}}(b) = \hat{\mathbf{y}}_n(b) \cdot \hat{E}(b) \cdot \hat{g}_{fg}(b). \qquad (12)$$
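The gain chain of Eqs. (10) to (12) can be followed on a toy band. This is a sketch with stated assumptions: the accuracy measure is read left to right as (N_pulses / L_b) * P_max, the non-uniform logarithmic quantizer for the prediction error is not specified in the text and is therefore skipped, and the band values, the quantized norm `e_hat` and all function names are hypothetical.

```python
import math

def predicted_gain(n_pulses, band_len, p_max):
    """Eq. (10), reading a(b) left to right as (N_pulses / L_b) * P_max."""
    accuracy = n_pulses / band_len * p_max
    return 1.0 - 0.05 / accuracy

def rms_normalize(v):
    """Scale a shape vector to unit root mean square."""
    rms = math.sqrt(sum(x * x for x in v) / len(v))
    return [x / rms for x in v]

def fine_gain(y_n, y_hat_n):
    """Least-squares gain y_n^T y_hat_n / (y_hat_n^T y_hat_n)."""
    return sum(a * b for a, b in zip(y_n, y_hat_n)) / sum(b * b for b in y_hat_n)

# Toy band: normalized target coefficients and a 4-pulse PVQ shape.
y_n = [1.6, -0.1, 0.6, 0.0]
z = [3, 0, 1, 0]
y_hat_n = rms_normalize(z)

g_pred = predicted_gain(n_pulses=4, band_len=4, p_max=3)
g_fg = fine_gain(y_n, y_hat_n)
g_err = g_fg / g_pred            # what the encoder would quantize when R_FG(b) > 0

# Decoder side, Eq. (11) and Eq. (12); g_err is left unquantized in this sketch.
g_fg_hat = g_err * g_pred
e_hat = 2.0                      # hypothetical quantized norm
y_hat = [v * e_hat * g_fg_hat for v in y_hat_n]
print(round(g_pred, 4), round(g_fg, 4), round(g_err, 4))
```

When no fine gain bits are available, the decoder falls back to `g_pred` alone, which is why a prediction close to the true least-squares gain matters for the low-rate bands.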
5. NOISE-FILL
The use of a pulse based VQ may give a peaky and sparse fine
structure which may be less suitable for noise-like signals. The
main spectral filling concept is to use the encoded fine structure to
fill the unencoded bands. This way a similarity to the surrounding
fine structure is preserved. In case of low spectral stability at lower
SWB rates, a two-stage densification procedure is used, as
illustrated in Figure 2.
[Figure 2 diagram: quantized and non-quantized bands; a compression and inclusion decision stage yields codebook $Y_1$, and a densification stage yields codebook $Y_2$.]
Figure 2: Dual spectral fill codebook generation
A compression of the coded residual vectors is performed as:
$$\hat{y}_{n,comp}(k) = \begin{cases} \mathrm{sign}(\hat{y}_n(k)), & \hat{y}_n(k) \neq 0 \\ 0, & \hat{y}_n(k) = 0, \end{cases} \qquad (13)$$
where $\hat{y}_n(k)$ refers to elements in $\hat{\mathbf{y}}_n(b)$. The virtual codebook
which constitutes the spectral codebook is built from only the
non-sparse sub-vectors of $\hat{y}_{n,comp}(k)$, where each sub-vector has a length
of 8. Compressed sub-vectors that do not fulfill a sparsity criterion:
$$\sum_{k=s_b+8i}^{s_b+8i+7} \left|\hat{y}_{n,comp}(k)\right| \geq 2, \quad i = 0, \ldots, \frac{L_b}{8}, \qquad (14)$$
are rejected. The remaining compressed sub-vectors are
concatenated into codebook $Y_1(k)$, of length $L_Y$. For a second
codebook $Y_2(k)$, $k = 0, \ldots, L_Y$, a densification process is used:
$$Y_2(k) = \begin{cases} \mathrm{sign}(Y_1(k)) \times \left(|Y_1(k)| + |Y_1(L_Y - k)|\right), & Y_1(k) \neq 0 \\ Y_1(L_Y - k), & Y_1(k) = 0. \end{cases} \qquad (15)$$
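The codebook construction of Eqs. (13) to (15) can be sketched as follows. Assumptions are made for illustration: the input is a flat list of decoded coefficients rather than a set of bands, the mirrored index in the densification is taken as `L_Y - 1 - k` so that it stays inside a zero-based list, and the function names are hypothetical.

```python
def sign(v):
    """Return -1, 0 or +1."""
    return (v > 0) - (v < 0)

def compress(y_hat_n):
    """Eq. (13): keep only the sign pattern of the coded coefficients."""
    return [sign(v) for v in y_hat_n]

def build_codebook_1(comp, sub_len=8, min_pulses=2):
    """Eq. (14): concatenate the sub-vectors holding at least two non-zero entries."""
    y1 = []
    for start in range(0, len(comp) - sub_len + 1, sub_len):
        sub = comp[start:start + sub_len]
        if sum(abs(v) for v in sub) >= min_pulses:
            y1.extend(sub)
    return y1

def densify(y1):
    """Eq. (15), with a zero-based mirrored index (assumption: L_Y - 1 - k)."""
    length = len(y1)
    y2 = []
    for k in range(length):
        mirror = y1[length - 1 - k]
        if y1[k] != 0:
            y2.append(sign(y1[k]) * (abs(y1[k]) + abs(mirror)))
        else:
            y2.append(mirror)
    return y2

# Two sub-vectors of length 8: the first has two pulses, the second only one.
coded = [0.0, 1.2, 0.0, 0.0, -0.7, 0.0, 0.0, 0.0,
         0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]
Y1 = build_codebook_1(compress(coded))
Y2 = densify(Y1)
print(Y1)   # [0, 1, 0, 0, -1, 0, 0, 0]
print(Y2)   # [0, 1, 0, -1, -1, 0, 1, 0]
```

In this example the second sub-vector is rejected as too sparse, and the densified codebook has twice as many non-zero entries as the first one.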
Figure 3 depicts the filling procedure, where $Y_1(k)$ is used below
the crossover frequency $f_{cb}$ = 4.8 kHz and $Y_2(k)$ is used above.
The spectral fill above the last encoded band is handled the same
way as in [6] for 64 kbps, while for 24.4 and 32 kbps there are
complementary spectral fill methods in use, as described in [14].
[Figure 3 diagram: a quantized band surrounded by filled bands, with the codebooks $Y_1$ and $Y_2$ and the crossover frequency $f_{cb}$ marked.]
Figure 3: Spectral filling from two codebooks
6. PERFORMANCE
6.1. Objective Performance
Figure 4 displays an SNR comparison for mixed and music content
in relation to an RE8+D8 based lattice quantizer [6]. The evaluated
Split-PVQ includes both the shape and the fine gain, since the lattice
quantizer captures both shape and gain of the quantized vectors.
[Figure 4 plot: segmental SNR [dB] (axis 6 to 26) versus bit rate [kbps] (axis 16 to 96) for Split-PVQ and RE8+D8 (G.719).]
Figure 4: Segmental SNR (per frame) vs. bit rate
In Figure 5, the complexity in terms of fixed point WMOPS [17]
for the described encoder side quantization algorithms is shown.
[Figure 5 chart: relative share (0% to 100%) of split-encoding, pulse-search, MPVQ-indexing, range-encoding, PVQ-normalization and gain-prediction per bit rate. Total complexity: 4.3 WMOPS at 24.4 kbps, 5.6 at 32 kbps, 8.1 at 48 kbps, 10.0 at 64 kbps and 12.6 at 96 kbps.]
Figure 5: Encoding WMOPS vs. bit rate
At the decoder side the MPVQ de-indexing complexity is on
average 25% heavier than the indexing, and the range decoder
complexity is roughly 150% of the range encoding complexity.
6.2. Subjective Evaluation
In addition to the EVS selection tests [18], numerous tests by the
proponent companies have been performed. The listening
environment, levels, etc. in the proponents’ test are described in
[19]. The performance of the technologies described in this paper
is exemplified by the performance of the EVS codec for WB mixed
and music content. Table 2 below gives average results for two
P.800 DCR [20] subjective tests of the EVS codec performed by
the proponents for WB mixed and music content with 20 listeners
in each test. In addition, for SWB mixed and music content, the
proponents observed results that are not worse than the original for
64 kbps. For 32 kbps some tests revealed statistically significant
improvements compared to G.719 at 32 kbps, while other tests only
showed an increase of the average MOS score within the
confidence region. It should be noted that for SWB at rates 24.4
and 32 kbps a larger portion of the band is subject to spectral fill
using methods which are outside the scope of this paper. Hence,
the WB results are deemed more relevant for the techniques
described here.
Bit rate | Reference codec (MOS score) | EVS (MOS score)
32 kbps | G.722.1: 4.18 | 4.42
64 kbps | G.722: 4.20 | 4.54
Table 2: Subjective MOS scores (WB P.800 DCR)
7. CONCLUSIONS
The objective and subjective experiments confirm that the
proposed novel MDCT-codec design is superior to several legacy
coding schemes. The complexity of the introduced Split-PVQ
concept, the dynamic range optimized MPVQ indexing, and the
efficient fine gain prediction allows their usage for real-time
coding applications. The results from formal MOS subjective
evaluations indicate that the designed system including the novel
noise-fill strategy, tailored for pulse-coding schemes, provides a
significant perceptual improvement.
8. REFERENCES
[1] 3GPP TS 26.441, “EVS Codec General Overview”, 2014
[2] 3GPP TS 26.401, “Enhanced aacPlus general audio codec
General Description”, 2004
[3] K. Ikeda, T. Moriya, N. Iwakami, S. Miki, ”A design of Twin
VQ audio-codec for personal communication systems”, IEEE
ICUPC 1995, pp 803-807, 1995
[4] J. Makinen, B. Bessette, S. Bruhn, P. Ojala, R. Salami, A.
Taleb, “AMR-WB+: a new audio coding standard for 3rd
generation mobile audio services”, ICASSP 2005, pp.
ii/1109- ii/1112 , Vol. 2, 2005
[5] R. Salami, C. Laflamme, J-P. Adoul, D. Massaloux, “A Toll
Quality 8 Kb/s Speech Codec for the Personal
Communications System (PCS)”, IEEE Trans. on Vehicular
Technology, Vol. 43, No. 3, August 1994
[6] ITU-T Rec. G.719, “Low-complexity full-band audio coding
for high-quality conversational applications”, 2008
[7] A. Gersho, R. Gray, “Vector Quantization and Signal
Compression”, Kluwer Academic Publishers, 1992
[8] J.P. Adoul, C. Lamblin, A. Leguyader, “Baseband Speech
Coding at 2400 bps using ‘Spherical Vector Quantization’”,
ICASSP 1984, pp. 45-48, Vol. 9, 1984
[9] T.R. Fischer, “A Vector Laplacian Quantizer”, Proceedings
21st Annual Allerton Conf. on Comm., Control and
Computing, October 5-7, pp 501-510, 1983
[10] T. Fischer, “A Pyramid Vector Quantizer,” IEEE Trans. Inf.
Theory, vol IT-32, NO. 4, 1986
[11] A.C. Hung, T.H-Y Meng, “Error Resilient Pyramid Vector
Quantization for Image Compression”, in Proc. IEEE Int.
Conf. Image Processing, 1994, pp 583-587, 1994
[12] J.-M. Valin, T. B. Terriberry, G. Maxwell, “A Full-
Bandwidth Audio Codec with Low Complexity and Very
Low Delay”, Proc. EUSIPCO, 2009
[13] J-M. Valin, K. Vos , T. Terriberry, “Definition of the Opus
Audio Codec”, IETF/RFC6716, September 2012
[14] 3GPP TS 26.445, “EVS Codec Detailed Algorithmic
Description”, 2014
[15] H. S. Malvar, “Signal Processing with Lapped Transforms”,
Norwood, MA, Artech House, 1992
[16] G. N. N. Martin, “Range encoding: An algorithm for
removing redundancy from a digitized message”, Video &
Data Recording Conf., 1979
[17] ITU-T Rec. G.191, “Software tools for speech and audio
coding standardization”, “BASOP: ITU-T Basic Operators”,
2009
[18] 3GPP Tdoc S4-141065, “Report of the Global Analysis Lab
for the EVS Selection Phase”, Dynastat Inc., 2014
[19] 3GPP Tdoc AHEVS-311, “EVS Permanent Document EVS-
8b: Test plans for selection phase including lab task
specification”, 2014
[20] ITU-T Rec. P.800, “Methods for Subjective Determination
of Transmission Quality”, 1996

Mais conteúdo relacionado

Mais procurados

Design and Performance Analysis of Convolutional Encoder and Viterbi Decoder ...
Design and Performance Analysis of Convolutional Encoder and Viterbi Decoder ...Design and Performance Analysis of Convolutional Encoder and Viterbi Decoder ...
Design and Performance Analysis of Convolutional Encoder and Viterbi Decoder ...IJERA Editor
 
A comparative study of different multiplier designs
A comparative study of different multiplier designsA comparative study of different multiplier designs
A comparative study of different multiplier designsHoopeer Hoopeer
 
Mimo radar detection in compound gaussian clutter using orthogonal discrete f...
Mimo radar detection in compound gaussian clutter using orthogonal discrete f...Mimo radar detection in compound gaussian clutter using orthogonal discrete f...
Mimo radar detection in compound gaussian clutter using orthogonal discrete f...ijma
 
An Energy-Efficient Lut-Log-Bcjr Architecture Using Constant Log Bcjr Algorithm
An Energy-Efficient Lut-Log-Bcjr Architecture Using Constant Log Bcjr AlgorithmAn Energy-Efficient Lut-Log-Bcjr Architecture Using Constant Log Bcjr Algorithm
An Energy-Efficient Lut-Log-Bcjr Architecture Using Constant Log Bcjr AlgorithmIJERA Editor
 
EE402B Radio Systems and Personal Communication Networks notes
EE402B Radio Systems and Personal Communication Networks notesEE402B Radio Systems and Personal Communication Networks notes
EE402B Radio Systems and Personal Communication Networks notesHaris Hassan
 
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...Iaetsd Iaetsd
 
On The Fundamental Aspects of Demodulation
On The Fundamental Aspects of DemodulationOn The Fundamental Aspects of Demodulation
On The Fundamental Aspects of DemodulationCSCJournals
 
Frequency domain behavior of S-parameters piecewise-linear fitting in a digit...
Frequency domain behavior of S-parameters piecewise-linear fitting in a digit...Frequency domain behavior of S-parameters piecewise-linear fitting in a digit...
Frequency domain behavior of S-parameters piecewise-linear fitting in a digit...Piero Belforte
 
Area efficient parallel LFSR for cyclic redundancy check
Area efficient parallel LFSR for cyclic redundancy check  Area efficient parallel LFSR for cyclic redundancy check
Area efficient parallel LFSR for cyclic redundancy check IJECEIAES
 
Multi carrier equalization by restoration of redundanc y (merry) for adaptive...
Multi carrier equalization by restoration of redundanc y (merry) for adaptive...Multi carrier equalization by restoration of redundanc y (merry) for adaptive...
Multi carrier equalization by restoration of redundanc y (merry) for adaptive...IJNSA Journal
 
Frame Synchronization for OFDMA mode of WMAN
Frame Synchronization for OFDMA mode of WMANFrame Synchronization for OFDMA mode of WMAN
Frame Synchronization for OFDMA mode of WMANPushpa Kotipalli
 
Low Complexity Multi-User MIMO Detection for Uplink SCMA System Using Expecta...
Low Complexity Multi-User MIMO Detection for Uplink SCMA System Using Expecta...Low Complexity Multi-User MIMO Detection for Uplink SCMA System Using Expecta...
Low Complexity Multi-User MIMO Detection for Uplink SCMA System Using Expecta...TELKOMNIKA JOURNAL
 
129966863283913778[1]
129966863283913778[1]129966863283913778[1]
129966863283913778[1]威華 王
 

Mais procurados (19)

iscas07
iscas07iscas07
iscas07
 
Design and Performance Analysis of Convolutional Encoder and Viterbi Decoder ...
Design and Performance Analysis of Convolutional Encoder and Viterbi Decoder ...Design and Performance Analysis of Convolutional Encoder and Viterbi Decoder ...
Design and Performance Analysis of Convolutional Encoder and Viterbi Decoder ...
 
A comparative study of different multiplier designs
A comparative study of different multiplier designsA comparative study of different multiplier designs
A comparative study of different multiplier designs
 
Mimo radar detection in compound gaussian clutter using orthogonal discrete f...
Mimo radar detection in compound gaussian clutter using orthogonal discrete f...Mimo radar detection in compound gaussian clutter using orthogonal discrete f...
Mimo radar detection in compound gaussian clutter using orthogonal discrete f...
 
An Energy-Efficient Lut-Log-Bcjr Architecture Using Constant Log Bcjr Algorithm
An Energy-Efficient Lut-Log-Bcjr Architecture Using Constant Log Bcjr AlgorithmAn Energy-Efficient Lut-Log-Bcjr Architecture Using Constant Log Bcjr Algorithm
An Energy-Efficient Lut-Log-Bcjr Architecture Using Constant Log Bcjr Algorithm
 
EE402B Radio Systems and Personal Communication Networks notes
EE402B Radio Systems and Personal Communication Networks notesEE402B Radio Systems and Personal Communication Networks notes
EE402B Radio Systems and Personal Communication Networks notes
 
paper
paperpaper
paper
 
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
Iaetsd implementation of power efficient iterative logarithmic multiplier usi...
 
R044120124
R044120124R044120124
R044120124
 
Gt3612201224
Gt3612201224Gt3612201224
Gt3612201224
 
On The Fundamental Aspects of Demodulation
On The Fundamental Aspects of DemodulationOn The Fundamental Aspects of Demodulation
On The Fundamental Aspects of Demodulation
 
Frequency domain behavior of S-parameters piecewise-linear fitting in a digit...
Frequency domain behavior of S-parameters piecewise-linear fitting in a digit...Frequency domain behavior of S-parameters piecewise-linear fitting in a digit...
Frequency domain behavior of S-parameters piecewise-linear fitting in a digit...
 
Area efficient parallel LFSR for cyclic redundancy check
Area efficient parallel LFSR for cyclic redundancy check  Area efficient parallel LFSR for cyclic redundancy check
Area efficient parallel LFSR for cyclic redundancy check
 
Multi carrier equalization by restoration of redundanc y (merry) for adaptive...
Multi carrier equalization by restoration of redundanc y (merry) for adaptive...Multi carrier equalization by restoration of redundanc y (merry) for adaptive...
Multi carrier equalization by restoration of redundanc y (merry) for adaptive...
 
Dg34662666
Dg34662666Dg34662666
Dg34662666
 
Frame Synchronization for OFDMA mode of WMAN
Frame Synchronization for OFDMA mode of WMANFrame Synchronization for OFDMA mode of WMAN
Frame Synchronization for OFDMA mode of WMAN
 
Low Complexity Multi-User MIMO Detection for Uplink SCMA System Using Expecta...
Low Complexity Multi-User MIMO Detection for Uplink SCMA System Using Expecta...Low Complexity Multi-User MIMO Detection for Uplink SCMA System Using Expecta...
Low Complexity Multi-User MIMO Detection for Uplink SCMA System Using Expecta...
 
Cb25464467
Cb25464467Cb25464467
Cb25464467
 
129966863283913778[1]
129966863283913778[1]129966863283913778[1]
129966863283913778[1]
 

Destaque

Ericsson Sustainability and Corporate Responsibility Report 2012
Ericsson Sustainability and Corporate Responsibility Report 2012Ericsson Sustainability and Corporate Responsibility Report 2012
Ericsson Sustainability and Corporate Responsibility Report 2012Ericsson
 
Harmonic vector quantization
Harmonic vector quantizationHarmonic vector quantization
Harmonic vector quantizationEricsson
 
Winning the Game
Winning the GameWinning the Game
Winning the GameEricsson
 
Ericsson ConsumerLab: Personal Information Economy
   Ericsson ConsumerLab: Personal Information Economy   Ericsson ConsumerLab: Personal Information Economy
Ericsson ConsumerLab: Personal Information EconomyEricsson
 
Ericsson ConsumerLab: Connected lifestyles’ expectations identified
Ericsson ConsumerLab: Connected lifestyles’ expectations identifiedEricsson ConsumerLab: Connected lifestyles’ expectations identified
Ericsson ConsumerLab: Connected lifestyles’ expectations identifiedEricsson
 
Ericsson Review: HSPA evolution for future mobile-broadband needs
Ericsson Review: HSPA evolution for future mobile-broadband needsEricsson Review: HSPA evolution for future mobile-broadband needs
Ericsson Review: HSPA evolution for future mobile-broadband needsEricsson
 
The sharing economy
The sharing economyThe sharing economy
The sharing economyEricsson
 
Internet of Things from a Networked Society perspective
Internet of Things from a Networked Society perspectiveInternet of Things from a Networked Society perspective
Internet of Things from a Networked Society perspectiveEricsson
 
Consumers and the Internet of Things
Consumers and the Internet of ThingsConsumers and the Internet of Things
Consumers and the Internet of ThingsEricsson
 
API Market Demand
API Market DemandAPI Market Demand
API Market DemandEricsson
 
Ericsson Review: The benefits of self-organizing backhaul networks
Ericsson Review: The benefits of self-organizing backhaul networksEricsson Review: The benefits of self-organizing backhaul networks
Ericsson Review: The benefits of self-organizing backhaul networksEricsson
 
Ericsson Technology Review, issue #1, 2016
Ericsson Technology Review, issue #1, 2016Ericsson Technology Review, issue #1, 2016
Ericsson Technology Review, issue #1, 2016Ericsson
 
Scaling Internet of Things
Scaling Internet of ThingsScaling Internet of Things
Scaling Internet of ThingsEricsson
 
Ericsson ConsumerLab: The one-click ideal - Presentation
Ericsson ConsumerLab: The one-click ideal - PresentationEricsson ConsumerLab: The one-click ideal - Presentation
Ericsson ConsumerLab: The one-click ideal - PresentationEricsson
 

MDCT audio coding with pulse vector quantizers

MDCT AUDIO CODING WITH PULSE VECTOR QUANTIZERS

Jonas Svedberg, Volodya Grancharov, Sigurdur Sverrisson, Erik Norvell, Tomas Toftgård, Harald Pobloth, and Stefan Bruhn
SMN, Ericsson Research, Ericsson AB
164 80, Stockholm, Sweden

ABSTRACT

This paper describes a novel audio coding algorithm that is a building block in the recently standardized 3GPP EVS codec [1]. The presented scheme operates in the Modified Discrete Cosine Transform (MDCT) domain and deploys a Split-PVQ pulse coding quantizer, a noise-fill, and a gain control optimized for the quantizer's properties. A complexity analysis in terms of WMOPS is presented to illustrate that the proposed Split-PVQ concept and the dynamic range optimized MPVQ-indexing are suitable for real-time audio coding. Test results from formal MOS subjective evaluations and objective performance figures are presented to illustrate the competitiveness of the proposed algorithm.

Index Terms: Audio Coding, MDCT, PVQ, Noise-Fill, EVS

1. INTRODUCTION

In high quality speech and audio coding there is a need to efficiently quantize the residual signals after an initial dynamic compression (or prediction) step. At high rates a scalar quantizer (SQ) followed by entropy coding has been in widespread use [2], while at medium rates trained codebooks [3] or lattice vector quantizers (LVQs) [4] have been used. The main disadvantage of trained codebooks is that they do not scale well to higher rates in terms of storage and search complexity. Scalar quantization with entropy coding has the drawback that the entropy statistics need to be stored and trained on the correct material, and the output is by nature variable rate (requiring a rate control loop). For low rate speech coding, fixed rate sparse, multi-pulse and phase structured pulse vector quantizers have been in successful use since the early 1990s [5].
At low rates they have provided a good trade-off between performance and implementation complexity in terms of both storage and cycles. However, lacking an extendable structure, they have been difficult to apply to higher bit rates.

An alternative to pulse vector quantizers at higher rates is the family of lattice quantizers, such as D8 [6] and rotated E8 [4]. Due to their regular structure they have very good search properties. However, handling lattice outliers, indexing a point in an extended lattice, and reconstructing the point from the received index may be cumbersome. Furthermore, lattice quantizers require entropy coding to provide good performance [7], and they have a dimensional rigidity, using fixed vector lengths such as 8 or 24 [8].

The Pyramid Vector Quantizer (PVQ) is a structured pulse vector quantizer introduced in the 1980s by Fischer [9] [10] (for Laplacian sources). The pyramid structure enables an efficient search, and by limiting the indexing problem to smaller sub-sections of controlled size, [11], [12] and [13] have shown that it may give a good trade-off between performance and complexity. When employed in a transform domain codec, the energy focusing property of a pulse based quantizer may at lower rates expose perceptual artifacts, such as local energy losses, which need to be addressed in a post-processing step.

In this paper we first give an overview of the MDCT based codec, followed by a presentation of the novel modules used to quantize and post-process the band normalized transform coefficients using a PVQ. Finally, to demonstrate the performance of the introduced Split-PVQ concept and the pulse coding optimized noise-fill, subjective and objective measurements are presented.

2. CODEC OVERVIEW

The proposed MDCT audio codec operates along the lines of the ITU-T G.719 framework [6].
As a coding mode in the EVS codec, it handles sample rates of 16/32/48 kHz (WB/SWB/FB) and is used at bit rates of 24.4, 32 and 64 kbps. An overview of the system is depicted in Figure 1. Similar to the G.719 scheme, it operates on overlapping input audio frames $x(t)$, but here a reduced algorithmic delay is achieved by using an alternative asymmetric window [14], which has a smaller overlap than the standard sinusoidal window. The windowed, time-aliased input $\tilde{x}(t)$ is transformed to MDCT [15] coefficients $y(k)$ using a DCT-IV:

$$y(k) = \sum_{t=0}^{L-1} \tilde{x}(t) \cos\left[\left(t + \tfrac{1}{2}\right)\left(k + \tfrac{1}{2}\right)\frac{\pi}{L}\right], \quad k = 0, \dots, L-1. \tag{1}$$

[Figure 1: Overview of the MDCT based codec system. Encoder blocks: adaptive windowing and transform, norm calculation and coding, bit allocation, spectrum normalization, PVQ gain and shape quantization. Decoder blocks: norm decoding, bit allocation, PVQ gain and shape decoding, noise-fill, spectrum scaling, inverse transform and overlap-add.]
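As a rough illustration of equation (1), the DCT-IV can be evaluated directly from its definition. This is a minimal sketch, not the codec's transform: the function names `dct_iv` and `inverse_dct_iv` are hypothetical, no scale factor is applied in the forward direction, and a real-time implementation would use a fast algorithm rather than this quadratic-time loop.

```python
import math

def dct_iv(x_tilde):
    """Direct O(L^2) evaluation of the DCT-IV kernel in equation (1), unscaled."""
    L = len(x_tilde)
    return [
        sum(x_tilde[t] * math.cos((t + 0.5) * (k + 0.5) * math.pi / L)
            for t in range(L))
        for k in range(L)
    ]

def inverse_dct_iv(y):
    """The unscaled DCT-IV is its own inverse up to a factor 2/L."""
    L = len(y)
    return [2.0 / L * v for v in dct_iv(y)]

# Example: a short folded frame survives a forward and inverse transform.
frame = [1.0, -2.0, 0.5, 3.0]
restored = inverse_dct_iv(dct_iv(frame))
```

The self-inverse property used in `inverse_dct_iv` is a general property of the DCT-IV kernel and is independent of the windowing and time-aliasing steps, which are not modeled here.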
The MDCT spectrum is partitioned into bands, where each band $b$ is associated with a norm factor $E(b)$, a quantized norm $\hat{E}(b)$ and a norm index $I(b)$. The set of $I(b)$ is input to a bit distribution algorithm that produces a bit allocation $B(b)$. Each band is normalized using the quantized norm, $\mathbf{y}_n(b) = \mathbf{y}(b)/\hat{E}(b)$, with $\mathbf{y}(b) = [y(s_b), y(s_b+1), \dots, y(e_b)]^T$, where $s_b$ and $e_b$ denote the start and end indices of band $b$. The band structure for 48 kHz is the same as in [6]; a complete list of band structures can be found in [14]. The normalized bands are then encoded using the PVQ described in section 3. Additional fine gain adjustments are made for each band as outlined in section 4. The decoder reconstructs the spectrum from the encoded parameters by reversing the steps of the encoder. Bands with zero allocated bits, $B(b) = 0$, are filled using the methods described in section 5.

3. QUANTIZER

The normalized MDCT coefficients in each band are quantized using a Split-PVQ concept. Bands whose bit allocation $R(b)$ (excluding bits for fine gain coding) is below a threshold are directly quantized into an energy normalized PVQ vector $\hat{\mathbf{y}}_n$. Bands with a higher number of allocated bits are first split into smaller target segments using an adaptive split methodology, and then each smaller segment is coded. In both cases the resulting codewords are subsequently encoded by a range encoder [16] supporting both uniform and tailor-made probability density functions (PDFs).

3.1. Split-PVQ Methodology

Pulse position coding often results in large codeword indices, especially for long input vectors, because the number of combinations increases rapidly with the dimension. For low complexity implementations, the available bits can instead be distributed between shorter sub-segments. Earlier examples of binary recursive splitting approaches can be found in [11] and in the indexing of [12].
An issue with those approaches is that the segmentation may produce segments of very different sizes, which may make the positional coding very inefficient. To demonstrate the inefficiency of non-uniform segmentation, consider splitting a 16-dimensional vector in two ways: A) symmetric (8+8) and B) asymmetric (2+14). Assume there are 2 pulses to code in each segment. Not considering overlap and sign, the number of codewords in each segment is $N!/(K!(N-K)!)$, where $N$ is the segment dimension and $K$ is the number of unit pulses. This results in A) 28+28 = 56 combinations and B) 1+91 = 92 combinations, which motivates the use of more symmetric segmentations.

In the presented scheme, an algorithm for band splitting is activated when the bit budget $R(b)$ for a particular band exceeds a pre-determined threshold. The input vector $\mathbf{y}_n(b)$ is split into uniform (or close to uniform) segments $\mathbf{y}_{seg}$ in a non-recursive way. The number of segments is calculated as:

$$N_p = \left\lceil \frac{R(b)}{R_{segmax} + R_{splitcost}} \right\rceil, \tag{2}$$

where $R_{segmax} = 32$ is the maximum allowed number of bits for a segment and $R_{splitcost}$ is an experimentally determined split overhead cost. If the number of allocated bits for each segment is close to $R_{segmax}$ and the variance of the segment energies is high, the number of splits is increased by one, allowing for higher flexibility in the segment bit allocation.

Each segment's shape is individually quantized, and energy ratios describe the relation between the segment energies. A difficulty in this approach is that energy variations between different segments might be large, which is unfavorable for the coding of relative energies. This is managed by calculating angles that represent energy ratios between a left and a right group of segments, with as equal a number of segments as possible in each group, using a recursive top-down approach.
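The codeword counts in the 8+8 versus 2+14 comparison, and the segment count of equation (2), can be checked with a few lines. This is only a sketch: the helper names are hypothetical, the value of `r_splitcost` is an arbitrary placeholder because the text only says that the split cost was determined experimentally, and rounding up in equation (2) is an assumption about the rounding operator.

```python
import math

R_SEGMAX = 32  # maximum number of bits per segment, as stated in the text

def position_codewords(n_dims, k_pulses):
    """Number of pulse position combinations N!/(K!(N-K)!), ignoring overlap and sign."""
    return math.comb(n_dims, k_pulses)

def num_segments(r_band, r_splitcost):
    """Equation (2), assuming the result is rounded up to the next integer."""
    return math.ceil(r_band / (R_SEGMAX + r_splitcost))

# A) symmetric 8+8 split and B) asymmetric 2+14 split, 2 pulses per segment.
symmetric = position_codewords(8, 2) + position_codewords(8, 2)
asymmetric = position_codewords(2, 2) + position_codewords(14, 2)

# Hypothetical split cost of 3 bits, chosen only for illustration.
segments_for_100_bits = num_segments(100, r_splitcost=3)
```

With these inputs `symmetric` and `asymmetric` reproduce the 56 and 92 combinations quoted in the text.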
At level $l$ of the recursion the angle is determined as

$$\alpha^l = \arctan\left(\sqrt{E_R^l / E_L^l}\right),$$

where $E_L^l$ and $E_R^l$ are the energies of the left and right groups. For even $N_p$, the first level angle represents the energy ratio between a left and a right group with an equal number of segments. At each step of the recursion, additional angles within each group are determined until the left and right groups consist of one segment each. For odd $N_p$, the angle calculated in the first iteration has a larger number of segments in the right group than in the left.

The angles are additionally used to recursively distribute bits to the already determined segments. At each level of the recursion, the bits $R_L^l$ and $R_R^l$ allocated to the left and right groups are determined from the available number of bits $R^l$ (after angle coding), the lengths of the groups $L_L^l$ and $L_R^l$ (number of coefficients), and the angle $\alpha^l$ by:

$$R_L^l = \frac{R^l - L_R^l \cdot \log_2\left(\tan \alpha^l\right)}{1 + L_R^l / L_L^l}, \qquad R_R^l = R^l - R_L^l. \tag{3}$$

For example, if the allocated bits for the PVQ, $R(b)$, require a split into three segments, the angle calculation and bit distribution are done as follows:

Step 0: The angle $\alpha^0$ is used to distribute the bits $R^0$, i.e. $R(b)$ minus the bits used to code $\alpha^0$, to the left and right groups of segments ($R_L^0$ and $R_R^0$). The left group includes just one segment, while the right group includes two segments.

Step 1: The right group from Step 0 is split into two groups, where the angle $\alpha_R^1$ is used to distribute the remaining bits of $R_R^0$ (after coding of $\alpha_R^1$) to $R_{RL}^1$ (the second segment) and $R_{RR}^1$ (the third segment). The bits $R_L^0$ are used directly to code the first segment.

3.2. Normalization and Shape Search

For quantization of each split segment $\mathbf{y}_{seg}$ we employ a unit hyper-sphere normalization approach similar to the concept in [8], here applied in the PVQ context (as in [12]).
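The bit distribution of equation (3) in section 3.1, and the three-segment example built on it, can be sketched as follows. The function names, the fixed `angle_bits` cost, the equal segment lengths, and the clamping of the result are assumptions made for illustration; the text does not state how many bits an angle consumes or how out-of-range allocations are handled.

```python
import math

def split_bits(r_total, len_left, len_right, alpha):
    """Equation (3): share r_total bits between a left and a right group.

    alpha must lie strictly between 0 and pi/2 so that tan(alpha) is
    positive and finite.
    """
    r_left = (r_total - len_right * math.log2(math.tan(alpha))) / (1.0 + len_right / len_left)
    # Clamping to [0, r_total] is an assumption of this sketch.
    r_left = min(max(r_left, 0.0), r_total)
    return r_left, r_total - r_left

def three_segment_allocation(r_band, seg_len, alpha0, alpha1_r, angle_bits):
    """Steps 0 and 1 of the three-segment example, with equal segment lengths."""
    # Step 0: one segment on the left, two segments on the right.
    r0 = r_band - angle_bits
    r_l0, r_r0 = split_bits(r0, seg_len, 2 * seg_len, alpha0)
    # Step 1: split the right group after paying for its own angle.
    r_rl1, r_rr1 = split_bits(max(r_r0 - angle_bits, 0.0), seg_len, seg_len, alpha1_r)
    return r_l0, r_rl1, r_rr1

# Equal energies everywhere (alpha = pi/4): bits follow the group lengths.
bits = three_segment_allocation(96, 8, math.pi / 4, math.pi / 4, angle_bits=3)
```

When the angle exceeds pi/4, i.e. the right group carries more energy, `split_bits` moves bits from the left group to the right group, in line with the sign of the logarithmic term in equation (3).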
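The quantized split segment $\hat{\mathbf{y}}_{seg}(g_{seg}, N, K)$ is defined as:

$$\hat{\mathbf{y}}_{seg}(g_{seg}, N, K) = g_{seg} \frac{\mathbf{z}_{N,K}}{\sqrt{\mathbf{z}_{N,K}^T \mathbf{z}_{N,K}}}, \tag{4}$$

$$\mathbf{z}_{N,K} = \left\{ \mathbf{e} : \sum_{i=0}^{N-1} |e_i| = K \right\}, \quad N = 1, \dots, 64, \quad K = 1, \dots, 512, \tag{5}$$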
where $\mathbf{z}_{N,K}$ is an integer point on the surface of an $N$-dimensional hyper-pyramid such that the L1 norm of $\mathbf{z}_{N,K}$ is $K$, and $g_{seg}$ is the segment gain ($\cos(\alpha)$ or $\sin(\alpha)$ at the lowest split level). The goal of the PVQ search is to minimize the shape error between the split segment $\mathbf{y}_{seg}$ and its quantized version $\hat{\mathbf{y}}_{seg}$. The search is performed by first using a projection [9] to a lower sub-pyramid, followed by an iterative approach (as described in [12]) that evaluates the addition of a single unit pulse to each dimension until the target pyramid is reached. Further, the search keeps track of the previous pyramid's maximum accumulated amplitude to speed up the innermost loop in a fixed point implementation.

3.3. Split-PVQ Codeword Encoding

For each split angle $\alpha$, angle bits are allocated based on the total number of bits allocated for the two groups of segments and the length $L_L + L_R$. The angles are uniformly quantized and then encoded by the range encoder. Efficient coding is achieved by modelling the angle statistics using an asymmetric triangular PDF, where the asymmetry is given by the ratio $L_L / L_R$.

The $\mathbf{z}_{N,K}$ vector is then indexed using a method that has similarities to both the product code enumeration in [11] and the optimized Fischer recursion in [12]. Two shape codewords are composed as follows: a first codeword representing the first sign encountered in the vector, independent of its position; and a second codeword representing (in a recursive fashion) all the remaining pulses in the remaining vector, which is now guaranteed to have a leading positive pulse. The second codeword is enumerated using the structure displayed in Table 1. The recursive structure defines the $U(N,K)$ offset matrix and enables the recursion computations to stay within the $B-1$ bit dynamic range of a $B$-bit signed integer.
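The shape search of section 3.2 can be approximated by the following floating point sketch. It projects the target onto a pyramid with at most K-1 pulses and then adds unit pulses one at a time, each time choosing the dimension that maximizes the squared correlation divided by the energy. The function name `pvq_search` is hypothetical, and the sketch omits the fixed point optimizations mentioned in the text, such as tracking the maximum accumulated amplitude.

```python
import math

def pvq_search(target, k_pulses):
    """Return an integer vector with L1 norm k_pulses that approximates the shape of target."""
    n = len(target)
    mags = [abs(v) for v in target]
    total = sum(mags)
    pulses = [0] * n
    if total > 0.0 and k_pulses > 1:
        # Projection to a lower sub-pyramid: flooring uses at most K-1 pulses.
        pulses = [int(math.floor((k_pulses - 1) * m / total)) for m in mags]
    corr = sum(p * m for p, m in zip(pulses, mags))
    energy = sum(p * p for p in pulses)
    for _ in range(k_pulses - sum(pulses)):
        best_i, best_num, best_den = 0, -1.0, 1.0
        for i in range(n):
            c = corr + mags[i]               # correlation after adding a pulse at i
            e = energy + 2 * pulses[i] + 1   # energy after adding a pulse at i
            # Compare c^2/e against the best ratio so far without dividing.
            if c * c * best_den > best_num * e:
                best_i, best_num, best_den = i, c * c, e
        corr += mags[best_i]
        energy += 2 * pulses[best_i] + 1
        pulses[best_i] += 1
    # Restore the signs of the target coefficients.
    return [int(math.copysign(p, v)) if p else 0 for p, v in zip(pulses, target)]

def normalize_shape(z, g_seg=1.0):
    """Equation (4): scale the integer vector to unit L2 norm times the segment gain."""
    norm = math.sqrt(sum(v * v for v in z))
    return [g_seg * v / norm for v in z]

z = pvq_search([0.9, -0.1, 0.4, 0.0], 4)
y_hat = normalize_shape(z)
```

Maximizing the squared correlation over the energy is equivalent to minimizing the error between the unit-norm target and the normalized candidate, which is why the search can work on integer pulses and defer the normalization of equation (4) to the end.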
| Lead value | Section size | Section definition |
|---|---|---|
| $K$ | $1$ | The all pulses consumed case; zeroes in the remaining dimensions. |
| $K-1, \dots, 1$ | $2 \cdot U(N,K)$ | All initial pulse amplitude cases with a subsequent new leading sign (positive or negative). |
| $0$ | $N_{MPVQ}(N-1,K)$ | The no initial pulse consumed cases; the current leading sign is kept for the next dimension. |

Table 1: Modular-PVQ (MPVQ) indexing structure

From Table 1 we find that the total number of entries (with the very first leading sign information removed) can be expressed as:

$$N_{MPVQ}(N,K) = 1 + 2 \cdot U(N,K) + N_{MPVQ}(N-1,K). \tag{6}$$

Combining (6) with Fischer's PVQ recursion in [9], we find that the total number of entries can be expressed as:

$$N_{MPVQ}(N,K) = 1 + U(N,K) + U(N,K+1). \tag{7}$$

Runtime computed values of the $U(N,K)$ matrix may now be used as the basis for the MPVQ indexing, and the update of the symmetric $U$ matrix from row $N-1$ to row $N$ can be performed as:

$$U(N,K+1) = 1 + U(N-1,K) + U(N-1,K+1) + U(N,K), \tag{8}$$

with initial conditions $U(N,0) = U(N,1) = U(0,K) = U(1,K) = 0$. The two short codewords of size 1 and size $B-1$ are finally sent to the range encoder using a uniform PDF.

4. GAIN CONTROL

The Split-PVQ is a pure shape quantizer, and a fine gain is needed to scale the quantized shape to the correct level. To determine the fine gain, the allocated bits $B(b)$ are split into bits for the PVQ quantizer $R(b)$ and bits for a fine gain quantizer $R_{FG}(b)$ according to:

$$\begin{cases} R_{FG}(b) = bits_{FG}(R_{samp}(b)) \\ R_{samp}(b) = \max\left(\lfloor B(b)/L_b \rfloor, 7\right) \\ R(b) = B(b) - R_{FG}(b), \end{cases} \tag{9}$$

where $bits_{FG}(R_{samp})$ is a lookup-table for fine gain bits.
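Returning briefly to the MPVQ enumeration of section 3.3, equations (6) to (8) can be checked numerically against Fischer's PVQ recursion. The sketch below builds the $U(N,K)$ matrix row by row and confirms that the MPVQ size is half the PVQ size, i.e. that exactly the one leading sign bit has been removed. The function names are hypothetical, and Python integers are unbounded, so the dynamic range benefit for a fixed width integer is not demonstrated here.

```python
def u_matrix(n_max, k_max):
    """U(N, K) for 0 <= N <= n_max and 0 <= K <= k_max + 1, using equation (8)."""
    # Initial conditions: U(N,0) = U(N,1) = U(0,K) = U(1,K) = 0.
    u = [[0] * (k_max + 2) for _ in range(n_max + 1)]
    for n in range(2, n_max + 1):
        for k in range(1, k_max + 1):
            u[n][k + 1] = 1 + u[n - 1][k] + u[n - 1][k + 1] + u[n][k]
    return u

def n_mpvq(u, n, k):
    """Equation (7): number of MPVQ entries with the leading sign removed."""
    return 1 + u[n][k] + u[n][k + 1]

def n_pvq(n, k):
    """Fischer's recursion for the number of integer points with L1 norm k in n dimensions."""
    table = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        table[i][0] = 1
    for i in range(1, n + 1):
        for j in range(1, k + 1):
            table[i][j] = table[i - 1][j] + table[i - 1][j - 1] + table[i][j - 1]
    return table[n][k]

U = u_matrix(8, 8)
sizes_match = all(
    2 * n_mpvq(U, n, k) == n_pvq(n, k)
    for n in range(1, 9)
    for k in range(1, 9)
)
```

The same table also satisfies equation (6), since `n_mpvq(U, n, k)` equals `1 + 2 * U[n][k] + n_mpvq(U, n - 1, k)` for every `n >= 2`.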
After quantizing the shape vector, a fine gain prediction is made using an accuracy measure $a(b)$ based on the bandwidth $L_b$, the total number of pulses used in the Split-PVQ encoding $N_{pulses}(b)$, and the highest identified integer pulse amplitude $P_{max}(b)$ in the set of encoded shape vectors $\mathbf{z}_{N,K}$ forming $\hat{\mathbf{y}}_n(b)$, according to:

$$\begin{cases} g_{pred}(b) = 1 - 0.05 / a(b) \\ a(b) = N_{pulses}(b) / L_b \cdot P_{max}(b). \end{cases} \tag{10}$$

For bands where $R_{FG}(b) > 0$, a fine gain prediction error is computed as $g_{err}(b) = g_{fg}(b) / g_{pred}(b)$, where the fine gain is $g_{fg}(b) = \mathbf{y}_n(b)^T \hat{\mathbf{y}}_n(b) \cdot (\hat{\mathbf{y}}_n(b)^T \hat{\mathbf{y}}_n(b))^{-1}$, and $\hat{\mathbf{y}}_n(b)$ has been normalized to unit root mean square (RMS). The gain prediction error $g_{err}(b)$ is encoded with a non-uniform scalar quantizer in the logarithmic domain using $R_{FG}(b)$ bits. The decoder reconstructs the fine gain adjustment through:

$$\hat{g}_{fg}(b) = \hat{g}_{err}(b) \cdot g_{pred}(b), \tag{11}$$

with $\hat{g}_{err}(b) = 1$ when $R_{FG}(b) = 0$. The final scaling of the reconstructed band is obtained as:

$$\hat{\mathbf{y}}(b) = \hat{\mathbf{y}}_n(b) \cdot \hat{E}(b) \cdot \hat{g}_{fg}(b). \tag{12}$$

5. NOISE-FILL

The use of a pulse based VQ may give a peaky and sparse fine structure, which may be less suitable for noise-like signals. The main spectral filling concept is to use the encoded fine structure to fill the unencoded bands. This way a similarity to the surrounding fine structure is preserved. In case of low spectral stability at lower SWB rates, a two-stage densification procedure is used, as illustrated in Figure 2.

[Figure 2: Dual spectral fill codebook generation. Labels: quantized band, non-quantized band, compression and inclusion decision, densification, codebooks $Y_1$ and $Y_2$.]
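The fine gain relations of section 4, equations (10) to (12), can be traced with a small numeric sketch. The function names are hypothetical, the non-uniform logarithmic quantizer for the prediction error is not modeled (the error is passed through unquantized), and the example vectors are arbitrary.

```python
def predicted_gain(n_pulses, band_len, p_max):
    """Equation (10): gain prediction from the accuracy measure a(b)."""
    accuracy = n_pulses / band_len * p_max
    return 1.0 - 0.05 / accuracy

def fine_gain(y_n, y_hat_n):
    """Least squares gain y_n^T y_hat_n / (y_hat_n^T y_hat_n)."""
    num = sum(a * b for a, b in zip(y_n, y_hat_n))
    den = sum(b * b for b in y_hat_n)
    return num / den

def gain_prediction_error(y_n, y_hat_n, g_pred):
    """Encoder side: ratio between the measured fine gain and its prediction."""
    return fine_gain(y_n, y_hat_n) / g_pred

def reconstruct_band(y_hat_n, e_hat, g_err_hat, g_pred):
    """Equations (11) and (12): decoder side fine gain and final band scaling."""
    g_fg_hat = g_err_hat * g_pred
    return [v * e_hat * g_fg_hat for v in y_hat_n]

# Arbitrary example: 8 pulses in a 16 coefficient band, largest pulse amplitude 2.
g_pred = predicted_gain(n_pulses=8, band_len=16, p_max=2)
y_n = [1.2, 0.0, -0.4, 0.1]
y_hat_n = [1.0, 0.0, -0.5, 0.0]
g_err = gain_prediction_error(y_n, y_hat_n, g_pred)
band = reconstruct_band(y_hat_n, e_hat=10.0, g_err_hat=g_err, g_pred=g_pred)
```

Because the decoder can recompute `g_pred` from the decoded pulses, only the prediction error needs to be transmitted; when no fine gain bits are available, setting `g_err_hat` to 1 falls back to the prediction alone.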
A compression of the coded residual vectors is performed as:

$$\hat{y}_{n,comp}(k) = \begin{cases} \operatorname{sign}(\hat{y}_n(k)), & \hat{y}_n(k) \neq 0 \\ 0, & \hat{y}_n(k) = 0, \end{cases} \tag{13}$$

where $\hat{y}_n(k)$ refers to the elements of $\hat{\mathbf{y}}_n(b)$. The virtual codebook that constitutes the spectral codebook is built only from the non-sparse sub-vectors of $\hat{y}_{n,comp}(k)$, where each sub-vector has a length of 8. Compressed sub-vectors that do not fulfill the sparsity criterion

$$\sum_{k=s_b+8i}^{s_b+8i+7} \left| \hat{y}_{n,comp}(k) \right| \geq 2, \quad i = 0, \dots, \frac{L_b}{8}, \tag{14}$$

are rejected. The remaining compressed sub-vectors are concatenated into the codebook $Y_1(k)$ of length $L_Y$. For a second codebook $Y_2(k)$, $k = 0, \dots, L_Y$, a densification process is used:

$$Y_2(k) = \begin{cases} \operatorname{sign}(Y_1(k)) \times \left( |Y_1(k)| + |Y_1(L_Y - k)| \right), & Y_1(k) \neq 0 \\ Y_1(L_Y - k), & Y_1(k) = 0. \end{cases} \tag{15}$$

Figure 3 depicts the filling procedure, where $Y_1(k)$ is used below the crossover frequency $f_{cb}$ = 4.8 kHz and $Y_2(k)$ is used above. The spectral fill above the last encoded band is handled the same way as in [6] for 64 kbps, while for 24.4 and 32 kbps complementary spectral fill methods are used, as described in [14].

[Figure 3: Spectral filling from two codebooks. Labels: quantized band, filled bands, codebooks $Y_1$ and $Y_2$, crossover frequency $f_{cb}$.]

6. PERFORMANCE

6.1. Objective Performance

Figure 4 displays an SNR comparison for mixed and music content in relation to an RE8+D8 based lattice quantizer [6]. The evaluated Split-PVQ includes both the shape and the fine gain, since the lattice quantizer captures both the shape and the gain of the quantized vectors.

[Figure 4: Segmental SNR (per frame) vs. bit rate. Axes: segmental SNR [dB], 6 to 26; bit rate [kbps], 16 to 96. Curves: Split-PVQ and RE8+D8 (G.719).]

In Figure 5, the complexity in terms of fixed point WMOPS [17] for the described encoder side quantization algorithms is shown.

[Figure 5: Encoding WMOPS vs. bit rate. Relative shares (0% to 100%) of split-encoding, pulse-search, MPVQ-indexing, range-encoding, PVQ-normalization and gain-prediction. Total WMOPS per bit rate: 4.3 at 24.4 kbps, 5.6 at 32 kbps, 8.1 at 48 kbps, 10.0 at 64 kbps, 12.6 at 96 kbps.]

At the decoder side the MPVQ de-indexing complexity is on average 25% heavier than the indexing, and the range decoder complexity is roughly 150% of the range encoding complexity.

6.2. Subjective Evaluation

In addition to the EVS selection tests [18], numerous tests have been performed by the proponent companies. The listening environment, levels, etc. in the proponents' tests are described in [19]. The performance of the technologies described in this paper is exemplified by the performance of the EVS codec for WB mixed and music content. Table 2 below gives average results for two P.800 DCR [20] subjective tests of the EVS codec performed by the proponents for WB mixed and music content, with 20 listeners in each test. In addition, for SWB mixed and music content, the proponents observed results that are not worse than the original at 64 kbps. At 32 kbps some tests revealed statistically significant improvements compared to G.719 at 32 kbps, while other tests only showed an increase of the average MOS score within the confidence region. It should be noted that for SWB at 24.4 and 32 kbps a larger portion of the band is subject to spectral fill using methods that are outside the scope of this paper. Hence, the WB results are deemed more relevant for the techniques described here.

| Bit rate | Codec | MOS score |
|---|---|---|
| 32 kbps | G.722.1 | 4.18 |
| 32 kbps | EVS | 4.42 |
| 64 kbps | G.722 | 4.20 |
| 64 kbps | EVS | 4.54 |

Table 2: Subjective MOS scores (WB P.800 DCR)

7. CONCLUSIONS

The objective and subjective experiments confirm that the proposed novel MDCT codec design is superior to several legacy coding schemes. The complexity of the introduced Split-PVQ concept, the dynamic range optimized MPVQ indexing, and the efficient fine gain prediction allows their use in real-time coding applications. The results from formal MOS subjective evaluations indicate that the designed system, including the novel noise-fill strategy tailored for pulse coding schemes, provides a significant perceptual improvement.
8. REFERENCES

[1] 3GPP TS 26.441, "EVS Codec General Overview", 2014
[2] 3GPP TS 26.401, "Enhanced aacPlus general audio codec General Description", 2004
[3] K. Ikeda, T. Moriya, N. Iwakami, S. Miki, "A design of Twin VQ audio-codec for personal communication systems", IEEE ICUPC 1995, pp. 803-807, 1995
[4] J. Makinen, B. Bessette, S. Bruhn, P. Ojala, R. Salami, A. Taleb, "AMR-WB+: a new audio coding standard for 3rd generation mobile audio services", ICASSP 2005, pp. ii/1109-ii/1112, Vol. 2, 2005
[5] R. Salami, C. LaFlamme, J-P. Adoul, D. Massaloux, "A Toll Quality 8 Kb/s Speech Codec for the Personal Communications System (PCS)", IEEE Trans. on Vehicular Technology, Vol. 43, No. 3, August 1994
[6] ITU-T Rec. G.719, "Low-complexity full-band audio coding for high-quality conversational applications", 2008
[7] A. Gersho, R. Gray, "Vector Quantization and Signal Compression", Kluwer Academic Publishers, 1992
[8] J.P. Adoul, C. Lamblin, A. Leguader, "Baseband speech Coding at 2400bps using 'Spherical Vector Quantization'", ICASSP 1984, pp. 45-48, Vol. 9, 1984
[9] T.R. Fischer, "A Vector Laplacian Quantizer", Proceedings 21st annual Allerton Conf. on Comm., Control and Computing, October 5-7, pp. 501-510, 1983
[10] T. Fischer, "A Pyramid Vector Quantizer", IEEE Trans. Inf. Theory, Vol. IT-32, No. 4, 1986
[11] A.C. Hung, T.H-Y Meng, "Error Resilient Pyramid Vector Quantization for Image Compression", in Proc. IEEE Int. Conf. Image Processing, pp. 583-587, 1994
[12] J.-M. Valin, T. B. Terriberry, G. Maxwell, "A Full-Bandwidth Audio Codec with Low Complexity and Very Low Delay", Proc. EUSIPCO, 2009
[13] J-M. Valin, K. Vos, T. Terriberry, "Definition of the Opus Audio Codec", IETF/RFC6716, September 2012
[14] 3GPP TS 26.445, "EVS Codec Detailed Algorithmic Description", 2014
[15] H. S. Malvar, "Signal Processing with Lapped Transforms", Norwood, MA, Artech House, 1992
[16] G. N. N. Martin, "Range encoding: An algorithm for removing redundancy from a digitized message", Video & Data Recording Conf., 1979
[17] ITU-T Rec. G.191, "Software tools for speech and audio coding standardization", "BASOP: ITU-T Basic Operators", 2009
[18] 3GPP Tdoc S4-141065, "Report of the Global Analysis Lab for the EVS Selection Phase", Dynastat Inc., 2014
[19] 3GPP Tdoc AHEVS-311, "EVS Permanent Document EVS-8b: Test plans for selection phase including lab task specification", 2014
[20] ITU-T Rec. P.800, "Methods for Subjective Determination of Transmission Quality", 1996