Chapter 14: Voice Codecs
Kari Järvinen¹
14.1 Overview
Five voice codecs have been standardised for GSM:

- Full-Rate (FR) codec
- Half-Rate (HR) codec
- Enhanced Full-Rate (EFR) codec
- Adaptive Multi-Rate (AMR) codec
- Adaptive Multi-Rate Wideband (AMR-WB) codec

All voice codecs include speech coding (source coding), channel coding (error protection
and bad frame detection), concealment of erroneous or lost frames (bad frame handling),
Voice Activity Detection (VAD), and a low bit rate source controlled mode for coding
background noise. The codecs operate either in the GSM full-rate traffic channel at the
gross bit rate of 22.8 kbit/s (FR, EFR, AMR-WB), or in the half-rate channel at the gross
bit rate of 11.4 kbit/s (HR), or in both (AMR). AMR and AMR-WB have also been specified
for use in 3G WCDMA.
The FR codec [1] was the first voice codec defined for GSM. The codec was standardised in
1989. It uses 13.0 kbit/s for speech coding and 9.8 kbit/s for channel coding. FR is the default
codec to provide speech service in GSM.
The HR codec [2] was developed to bring channel capacity savings through operation in
the half-rate channel. The codec was standardised in 1995. It operates at a speech coding
bit rate of 5.6 kbit/s, with 5.8 kbit/s used for channel coding. The codec provides the same
level of speech quality as the FR codec, except in background noise and in tandem (two
encodings in MS-to-MS calls), where the performance is somewhat lower.
The EFR codec [3] was the first codec to provide digital cellular systems with voice quality
equivalent to that of a wireline telephony reference (the ITU-T G.726 ADPCM standard at
32 kbit/s). The EFR codec brings a substantial quality improvement over the previous GSM
codecs. EFR was standardised first for the GSM-based PCS 1900 system in the US during
1995 and was adopted for GSM in 1996. The EFR codec uses 12.2 kbit/s for speech coding and
10.6 kbit/s for channel coding.
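
The bit rates quoted above for FR, HR and EFR can be cross-checked with a little arithmetic: in each case the speech coding and channel coding rates should add up to the gross rate of the traffic channel. The following is only a minimal sketch of that check. The 20 ms frame length is taken from Table 14.3; the function and dictionary names are illustrative and do not come from any specification.

```python
# Gross channel rates and per-codec rate splits, as quoted in the text (kbit/s).
GROSS_RATE_KBPS = {"full-rate": 22.8, "half-rate": 11.4}

# codec -> (traffic channel, speech coding rate, channel coding rate)
CODECS = {
    "FR": ("full-rate", 13.0, 9.8),
    "HR": ("half-rate", 5.6, 5.8),
    "EFR": ("full-rate", 12.2, 10.6),
}

FRAME_MS = 20  # frame length of all GSM voice codecs (Table 14.3)


def bits_per_frame(rate_kbps: float) -> int:
    """Convert a rate in kbit/s to bits per 20 ms frame (kbit/s * ms = bits)."""
    return round(rate_kbps * FRAME_MS)


def budget(codec: str) -> dict:
    """Return the per-frame bit budget of one codec."""
    channel, speech, coding = CODECS[codec]
    return {
        "channel": channel,
        "speech_bits": bits_per_frame(speech),
        "coding_bits": bits_per_frame(coding),
        "gross_bits": bits_per_frame(GROSS_RATE_KBPS[channel]),
    }


if __name__ == "__main__":
    for name in CODECS:
        b = budget(name)
        assert b["speech_bits"] + b["coding_bits"] == b["gross_bits"]
        print(name, b)
```

For FR, for example, 13.0 kbit/s corresponds to 260 speech bits per frame, which matches the frame size given later in the description of the FR channel codec.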
A further development in GSM voice quality was the standardisation of the AMR codec [4]
in 1999. AMR offers a substantial improvement over EFR in error robustness in the full-rate
channel by adapting speech and channel coding to the channel conditions. Channel
capacity is gained by switching to the half-rate channel when channel conditions are
good. The AMR codec includes several modes for use in both the full-rate and half-rate
channels. The speech coding bit rates are between 4.75 and 12.2 kbit/s in the full-rate channel
(eight modes) and between 4.75 and 7.95 kbit/s in the half-rate channel (six modes). The
AMR codec was adopted in 1999 by 3GPP as the default speech codec for the 3G WCDMA
system.

¹ The views expressed in this chapter are those of the author and do not necessarily reflect
the views of his affiliation entity.

GSM and UMTS: The Creation of Global Mobile Communication
Edited by Friedhelm Hillebrand
Copyright © 2001 John Wiley & Sons Ltd
ISBNs: 0-470-84322-5 (Hardback); 0-470-84554-6 (Electronic)

The AMR-WB codec [5] is the most recent voice codec. It was standardised in 2001 for
both GSM and 3G WCDMA systems. Later in 2001, the rapporteur's meeting of ITU-T Q.7/16
chose the AMR-WB speech codec as the new ITU-T wideband speech coding algorithm
at around 16 kbit/s. AMR-WB is an adaptive multi-rate codec like the AMR (narrowband)
codec. AMR-WB brings quality improvement through the use of extended audio bandwidth.
While all previous codecs in digital cellular systems operate on a narrow audio bandwidth
limited to below 3.4 kHz, AMR-WB extends the bandwidth to 7 kHz. Wideband coding
improves voice quality, especially in terms of voice naturalness. AMR-WB consists
of nine modes operating at speech coding bit rates between 6.6 and 23.85 kbit/s.
A voice codec related development in GSM and 3G WCDMA was the definition of in-band
Tandem Free Operation (TFO). This feature, including TFO for AMR, was completed in
March 2001 [23]. TFO improves speech quality in MS-to-MS calls by avoiding
double transcoding in the network. TFO can be employed when the same speech codec is
used at both ends of the call.
The speech coding part in all the voice codecs is based on Linear Predictive
Coding (LPC). All except the FR codec belong to the class of speech coding algorithms
generally known as Code Excited Linear Prediction (CELP). All codecs operate at a
sampling rate of 8 kHz except AMR-WB, which uses a 16 kHz sampling rate. Channel coding
in all codecs is based on convolutional coding for error correction combined with a Cyclic
Redundancy Check (CRC) for error detection. Three protection classes are typically used:
bits protected by the convolutional code and CRC, bits protected by the convolutional code
alone, and bits without any error protection.
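
As a rough illustration of the linear prediction that underlies all five codecs, the sketch below derives predictor and reflection coefficients from an autocorrelation sequence using the textbook Levinson-Durbin recursion. This is an assumption made for illustration only: the chapter does not say which recursion each codec uses, and the standardised codecs are bit-exact fixed-point algorithms, whereas this sketch uses floating point. All names are hypothetical.

```python
def levinson_durbin(r, order):
    """Solve the LPC normal equations for autocorrelation values r[0..order].

    Returns (a, reflection, err) where a[0] == 1.0 and the prediction error
    filter is A(z) = sum(a[j] * z**-j). Textbook recursion, not taken from
    any GSM specification.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    reflection = []
    for i in range(1, order + 1):
        # Correlation between the current prediction error and lag i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        reflection.append(k)
        # Update the predictor coefficients for the new order.
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        # Each stage reduces the prediction error energy by (1 - k^2).
        err *= 1.0 - k * k
    return a, reflection, err


if __name__ == "__main__":
    # Autocorrelation of a first-order process x[n] = 0.5 * x[n-1] + noise.
    r = [1.0, 0.5, 0.25]
    a, refl, err = levinson_durbin(r, order=2)
    print(a, refl, err)
```

For the first-order example, the recursion recovers a single non-zero predictor coefficient, which is what one would expect when only adjacent samples are correlated beyond what the first coefficient explains.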
The voice codec specifications define the speech codec bit-exactly to guarantee high basic
voice quality. For bad frame handling, only an example solution is given to allow the
possibility for implementation-specific performance improvements in error concealment.
Tables 14.1–14.3 give a summary of the GSM voice codecs: standards, implementation
complexity, and algorithmic delay.
14.2 Codec Selection Process
The development of GSM voice codecs has been carried out in ETSI SMG11 and in its
predecessors. Finalisation of channel coding has taken place under SMG2. The AMR-WB
codec was developed jointly by SMG11 and 3GPP TSG-SA WG4.
All the voice codecs have been chosen through a competitive selection process among
several candidate codec algorithms.
Before the codec selection process starts, speech quality performance requirements and
codec design constraints (e.g. implementation complexity and transmission delay) have to be
defined. For the most recent codecs (AMR and AMR-WB), the launch of standardisation has
been preceded by a feasibility study phase to validate the new codec concept.

Table 14.1 Voice codec standards

| Codec | Year of standard | Speech coding bit rate (kbit/s) | System/traffic channel | Speech coding algorithm |
|---|---|---|---|---|
| FR codec | 1989 | 13.0 | GSM FR | Regular Pulse Excitation – Long Term Prediction (RPE-LTP) |
| HR codec | 1995 | 5.6 | GSM HR | Vector-Sum Excited Linear Prediction (VSELP) |
| EFR codec | 1996 | 12.2 | GSM FR | Algebraic Code Excited Linear Prediction (ACELP) |
| AMR codec | 1999 | 12.2, 10.2, 7.95, 7.4, 6.7, 5.9, 5.15, 4.75 | GSM FR (all eight modes), GSM HR (six lowest modes), 3G WCDMA (all modes) | Algebraic Code Excited Linear Prediction (ACELP) |
| AMR-WB codec | 2001 | 23.85, 23.05, 19.85, 18.25, 15.85, 14.25, 12.65, 8.85, 6.60 | GSM FR (seven lowest modes), EDGE (all modes), 3G WCDMA (all modes) | Algebraic Code Excited Linear Prediction (ACELP) |
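
The AMR mode sets in Table 14.1 lend themselves to a small sketch of mode adaptation: pick the highest speech coding rate that the current channel quality is assumed to support, within the set of modes available on the traffic channel in use. The mode list and the full-rate/half-rate split come from the table. The carrier-to-interference thresholds, the dictionary names and the selection rule are purely hypothetical and are not taken from the AMR specifications.

```python
# AMR speech coding modes in kbit/s, as listed in Table 14.1.
AMR_MODES_KBPS = [4.75, 5.15, 5.9, 6.7, 7.4, 7.95, 10.2, 12.2]

# Per Table 14.1: the GSM full-rate channel carries all eight modes,
# the half-rate channel only the six lowest.
MODES_BY_CHANNEL = {
    "full-rate": AMR_MODES_KBPS,
    "half-rate": AMR_MODES_KBPS[:6],
}

# HYPOTHETICAL thresholds: minimum carrier-to-interference ratio (dB) at
# which each mode is assumed to be usable. Invented for illustration only.
HYPOTHETICAL_MIN_CI_DB = {
    4.75: 0.0, 5.15: 2.0, 5.9: 4.0, 6.7: 6.0,
    7.4: 8.0, 7.95: 10.0, 10.2: 12.0, 12.2: 14.0,
}


def select_mode(ci_db: float, channel: str) -> float:
    """Return the highest usable speech coding rate for the given channel.

    Lower speech rates leave more of the gross bit rate for channel coding,
    so poor channel conditions map to low-rate modes.
    """
    candidates = MODES_BY_CHANNEL[channel]
    usable = [m for m in candidates if HYPOTHETICAL_MIN_CI_DB[m] <= ci_db]
    # Fall back to the most robust (lowest rate) mode if nothing qualifies.
    return max(usable) if usable else min(candidates)


if __name__ == "__main__":
    for ci in (-5.0, 9.0, 20.0):
        print(ci, select_mode(ci, "full-rate"), select_mode(ci, "half-rate"))
```

The design choice illustrated here is only the direction of the trade-off: as channel quality drops, the speech coding rate is reduced so that a larger share of the fixed gross rate is available for error protection.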
Table 14.2 Implementation complexity of voice codecs

| Codec | Speech codec: WMOPS | Speech codec: data RAM (16-bit kwords) | Speech codec: data ROM (16-bit kwords) | Speech codec: program ROM (1000 assembly instructions) | GSM channel codec: WMOPS | GSM channel codec: data RAM (16-bit kwords) | GSM channel codec: data ROM (16-bit kwords) | GSM channel codec: program ROM (1000 assembly instructions) |
|---|---|---|---|---|---|---|---|---|
| FR codec | 3.0 | 1.2 | 0.1 | 0.9 | 1.7 | 1.7 | 0.8 | 0.3 |
| HR codec | 18.5 | 4.4 | 7.9 | 4.1 | 2.7 | 3.2 | 0.9 | 1.3 |
| EFR codec | 15.2 | 4.7 | 5.3 | 1.8 | See complexity of the FR channel codec | | | |
| AMR codec (a) | 16.8 | 5.3 | 14.6 | 4.9 | 5.2 (FR), 2.9 (HR) | 2.6 (FR), 2.4 (HR) | 5.2 | 1.3 |
| AMR-WB codec (a) | 35.4 | 6.4 | 9.9 | 3.8 | 3.5 | 2.9 | 3.2 | 0.6 |

(a) Complexity of channel quality measurement and mode control is counted as part of channel coding.

The selection process typically consists of two phases: a qualification (pre-selection) phase
and a selection phase. During the qualification phase, the most promising candidate codecs
are chosen to enter the selection phase. The qualification is usually based on in-house
listening tests. In the selection phase, the codec proposals are tested more comprehensively
in several independent test laboratories and in multiple languages. The codec proposals
are implemented in C code with fixed-point arithmetic. For both phases, the codec proponents
need to deliver documentation of their proposal, including a justification that it meets all
design constraints. The codec selection is based both on the speech quality of the candidate
codecs and on fulfilment of the other design requirements.
After codec selection, a verification phase and a characterisation phase take place. An
optimisation phase may be launched to improve key aspects of the codec's performance if there
is sufficient promise of improvement. During the verification phase, the codec is subjected to
further analysis to verify its suitability for the intended systems and applications. A detailed
analysis of implementation complexity and transmission delay is also carried out during this
phase. The final phase of codec standardisation is the characterisation phase. This is launched
after the approval of the codec standard to characterise the codec in a large variety of
operational conditions. The output is a technical report on codec performance.
14.3 FR Codec
In the FR voice codec, the speech coding part is based on the Regular Pulse Excitation – Long
Term Prediction (RPE-LTP) algorithm [6]. The frame length is 20 ms, i.e. a set of codec
parameters is produced every 20 ms. The speech codec operates at 13.0 kbit/s while
9.8 kbit/s is used for channel coding. FR is the default codec to provide speech service in GSM.
The FR speech codec carries out short-term LPC analysis once every frame (without any
lookahead over future samples). The rest of the coding is performed in 5 ms sub-frames. The
short-term residual signal, after LPC analysis, is further compressed by using Long-Term
Prediction (LTP) analysis. LTP removes any long-term correlation remaining in the
short-term residual signal. The long-term residual is then decimated into a sparse signal in which
only every third sample has a non-zero value. The non-zero samples are located on a regular
grid. The grid starting position is determined separately for each sub-frame based on the
energy of the sub-frame. This Regular Pulse Excitation (RPE) approach results in rather
efficient coding. Only the non-zero samples in the long-term residual need to be quantised
and sent to the decoder. The parameters for each 20 ms frame consist of a set of LPC
coefficients (reflection coefficients) and a set of parameters describing the short-term residual
for each sub-frame (LTP parameters, RPE parameters). A block diagram of the encoder is
shown in Figure 14.1.

Table 14.3 Algorithmic transmission delay components of the speech codecs

| Codec | Frame length (ms) | Lookahead in LPC analysis (ms) |
|---|---|---|
| FR codec | 20 | 0 |
| HR codec | 20 | 4.4 |
| EFR codec | 20 | 0 |
| AMR codec | 20 | 5 |
| AMR-WB codec | 20 | 5 |

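
The RPE grid selection described in this section can be sketched roughly as follows. A 5 ms sub-frame at 8 kHz holds 40 samples; the sketch keeps every third sample on one of several candidate grids and picks the grid with the most energy. The choice of four candidate offsets with 13 pulses each is an assumption about the FR specification that is not stated in this chapter, and the weighting filter and quantisation of the real codec are omitted. The function names are hypothetical.

```python
SUBFRAME_LEN = 40        # 5 ms sub-frame at 8 kHz sampling
DECIMATION = 3           # only every third sample is kept (from the text)
PULSES = 13              # ASSUMPTION: pulses kept per sub-frame
GRID_OFFSETS = range(4)  # ASSUMPTION: candidate grid starting positions


def select_rpe_grid(residual):
    """Pick the decimation grid with the highest energy.

    residual: long-term residual of one sub-frame (40 samples).
    Returns (offset, pulses), where pulses are the retained samples.
    """
    if len(residual) != SUBFRAME_LEN:
        raise ValueError("expected a 40-sample sub-frame")
    best_offset, best_pulses, best_energy = 0, [], -1.0
    for offset in GRID_OFFSETS:
        pulses = [residual[offset + DECIMATION * i] for i in range(PULSES)]
        energy = sum(p * p for p in pulses)
        if energy > best_energy:
            best_offset, best_pulses, best_energy = offset, pulses, energy
    return best_offset, best_pulses


def rebuild_sparse(offset, pulses):
    """Decoder side: place the pulses back on the grid, zeros elsewhere."""
    sparse = [0.0] * SUBFRAME_LEN
    for i, p in enumerate(pulses):
        sparse[offset + DECIMATION * i] = p
    return sparse


if __name__ == "__main__":
    # Toy residual whose energy sits on the samples at positions 1, 4, 7, ...
    residual = [1.0 if n % 3 == 1 else 0.1 for n in range(SUBFRAME_LEN)]
    offset, pulses = select_rpe_grid(residual)
    print(offset, rebuild_sparse(offset, pulses))
```

Only the grid offset and the retained pulses would need to be transmitted in such a scheme, which is where the coding efficiency mentioned in the text comes from.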
The FR channel codec uses convolutional coding to protect the 182 most important bits
out of the 260 bits in each frame [11]. A 3-bit CRC is employed for bad frame detection. The
CRC covers the 50 most important bits.
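
The figures above can be tied back to the 22.8 kbit/s gross rate with a short bit-budget sketch, together with a toy version of CRC-based bad frame detection. The numbers 260, 182, 50 and 3 come from the text. The rate-1/2 convolutional code, the four tail bits and the generator polynomial x^3 + x + 1 are assumptions about the GSM channel coding specification that are not stated in this chapter, and the CRC routine is a generic polynomial division, not a bit-exact reproduction of the standard.

```python
FRAME_BITS = 260        # speech bits per 20 ms frame (from the text)
PROTECTED_BITS = 182    # bits covered by the convolutional code (from the text)
CRC_COVERED_BITS = 50   # most important bits, also covered by the CRC
CRC_BITS = 3

# ASSUMPTIONS, not stated in this chapter:
TAIL_BITS = 4            # bits that flush the convolutional encoder
CODE_RATE_INVERSE = 2    # rate 1/2: two coded bits per input bit
CRC_POLY = [1, 0, 1, 1]  # x^3 + x + 1


def coded_frame_bits() -> int:
    """Bits per frame after channel coding under the assumptions above."""
    unprotected = FRAME_BITS - PROTECTED_BITS
    encoder_input = PROTECTED_BITS + CRC_BITS + TAIL_BITS
    return encoder_input * CODE_RATE_INVERSE + unprotected


def crc3(bits):
    """Remainder of bits * x^3 divided by CRC_POLY over GF(2)."""
    reg = list(bits) + [0] * CRC_BITS
    for i in range(len(bits)):
        if reg[i]:
            for j, coeff in enumerate(CRC_POLY):
                reg[i + j] ^= coeff
    return reg[-CRC_BITS:]


def frame_is_bad(received_bits, received_crc) -> bool:
    """Flag the frame as bad if the recomputed CRC does not match."""
    return crc3(received_bits) != list(received_crc)


if __name__ == "__main__":
    print(coded_frame_bits())  # compare with 22.8 kbit/s * 20 ms
    important = [1, 0] * (CRC_COVERED_BITS // 2)
    parity = crc3(important)
    corrupted = important[:]
    corrupted[7] ^= 1
    print(frame_is_bad(important, parity), frame_is_bad(corrupted, parity))
```

A 3-bit CRC can only catch a fraction of error patterns, which is one reason the text leaves room for implementation-specific bad frame handling on top of it.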
The FR codec, like all GSM and 3G WCDMA codecs, includes a low bit rate source
controlled mode for coding background noise only (voice activity detection with
discontinuous transmission). This saves power in the mobile station and also reduces the overall
interference level over the air interface.
The complexity of the FR speech codec is about 3.0 WMOPS (weighted million operations
per second). The complexity has been estimated from C code implemented with a fixed-point
function library in which each operation has been assigned a weight representative of the
cost of performing the operation on a typical DSP. The channel coding requires about
1.7 WMOPS [13].
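
The WMOPS measure described above amounts to a weighted operation count scaled to one second. The sketch below shows the arithmetic only. The operation names, the weights and the per-frame counts are all invented for illustration and are not the weights of the actual fixed-point function library, nor measurements of any codec.

```python
FRAME_MS = 20
FRAMES_PER_SECOND = 1000 // FRAME_MS  # 50 frames per second

# HYPOTHETICAL weights: relative cost of each basic operation on a DSP.
HYPOTHETICAL_WEIGHTS = {"add": 1, "mult": 1, "mac": 1, "div": 18, "move32": 2}


def wmops(op_counts_per_frame, weights=HYPOTHETICAL_WEIGHTS):
    """Weighted million operations per second for one frame's operation counts.

    op_counts_per_frame: mapping of operation name to the number of times it
    is executed while processing one 20 ms frame.
    """
    weighted_ops = sum(weights[op] * count
                       for op, count in op_counts_per_frame.items())
    return weighted_ops * FRAMES_PER_SECOND / 1e6


if __name__ == "__main__":
    # Invented per-frame counts, not measured from any real codec.
    counts = {"mac": 40000, "add": 15000, "div": 100, "move32": 1600}
    print(wmops(counts))
```

In practice such a figure would presumably be taken over the most demanding frames, since the processor has to keep up in real time; that detail is an assumption here rather than something stated in the chapter.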
14.4 HR Codec
The HR codec employs the Vector-Sum Excited Linear Prediction (VSELP) speech coding
algorithm [7]. VSELP belongs to the class of CELP codecs. The codec uses a 20 ms frame
length. The speech codec operates at a bit rate of 5.6 kbit/s while 5.8 kbit/s is used for
channel coding.
Like most CELP codecs, the HR VSELP codec employs two codebooks: a fixed codebook and an
adaptive codebook. The adaptive codebook is derived from the long-term filter state (and
therefore the content of the codebook changes frame-by-frame). The adaptive codebook is
Figure 14.1 Block diagram of the GSM FR speech encoder
