Email:
C9-411 Dai Co Viet str. 1, Hanoi
Pham Van Tien, Dr. rer. nat. , Embedded Networking Research Group
Faculty of Elec. and Telecom, Hanoi University of Science and Technology
Digital Video
Tien Pham Van, Dr. rer. nat.
Hanoi University of Science and Technology
CuuDuongThanCong.com
Agenda
• Introduction to video basics
• Video data presentation and rendering
• Video compression and communication
Introduction to Video Basics
Types of Video Signals
• Component video -- each primary is sent as a separate video signal.
  – The primaries can either be RGB or a luminance-chrominance transformation of them (e.g., YIQ, YUV).
  – Best color reproduction
  – Requires more bandwidth and good synchronization of the three components
• Composite video -- color (chrominance) and luminance signals are mixed into a single carrier wave.
  – Some interference between the two signals is inevitable.
• S-Video (Separated video, e.g., in S-VHS) -- a compromise between component analog video and composite video. It uses two lines, one for luminance and another for the composite chrominance signal.
Analog Video
Analog video is represented as a continuous (time-varying) signal; digital video is represented as a sequence of digital images.
NTSC Video
• 525 scan lines per frame, nominally 30 fps (29.97 fps for color, i.e. 33.37 msec/frame).
• Interlaced, each frame is divided into 2 fields, 262.5 lines/field
• 20 lines reserved for control information at the beginning of each field
• So a maximum of 485 lines of visible data
• Laserdisc and S-VHS have actual resolution of ~420 lines
• Ordinary TV -- ~320 lines
• Each line takes 63.5 microseconds to scan.
• Color representation: uses YIQ color model.
PAL (SECAM) Video
• 625 scan lines per frame, 25 frames per second (40 msec/frame)
• Interlaced, each frame is divided into 2 fields, 312.5 lines/field
• Color representation: uses YUV color model
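The frame and line timings quoted on this slide follow from the line count and the frame rate. The sketch below recomputes them, assuming the exact NTSC color frame rate of 30000/1001 (about 29.97 fps; the slide's "30 fps" is the nominal value). The function name `timing` is illustrative only.

```python
from fractions import Fraction

def timing(lines_per_frame, frames_per_second):
    """Return (frame period in ms, line period in microseconds, field rate in Hz)."""
    frame_ms = 1000 / frames_per_second
    line_us = 1e6 / (lines_per_frame * frames_per_second)
    field_hz = 2 * frames_per_second  # interlaced: two fields per frame
    return float(frame_ms), float(line_us), float(field_hz)

ntsc = timing(525, Fraction(30000, 1001))  # assumed exact NTSC color rate
pal = timing(625, 25)

print("NTSC: %.2f ms/frame, %.2f us/line, %.2f fields/s" % ntsc)
print("PAL:  %.2f ms/frame, %.2f us/line, %.2f fields/s" % pal)
```

Under these assumptions NTSC comes out at about 33.37 ms per frame and 63.56 microseconds per line, and PAL at 40 ms per frame and 64 microseconds per line.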
Frame Rate and Interlacing
• Persistence of vision: The human eye retains an image for a fraction of a second after it views the image. This property is essential to all visual display technologies.
  – The basic idea is quite simple: single still frames are presented at a high enough rate so that persistence of vision integrates these still frames into motion.
• Motion pictures originally set the frame rate at 16 frames per second. This was rapidly found to be unacceptable and the frame rate was increased to 24 frames per second. In Europe, this was changed to 25 frames per second, as the European power line frequency is 50 Hz.
• When NTSC television standards were introduced, the frame rate was set at 30 Hz (1/2 the 60 Hz line frequency). Movies filmed at 24 frames per second are simply converted to 30 frames per second on television broadcasting.
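The slide does not say how 24 fps film is converted to 30 fps video. One common technique is 3:2 pulldown, in which film frames are held alternately for two and three interlaced fields. The sketch below assumes that technique; the function name `pulldown_32` and the frame labels are illustrative.

```python
def pulldown_32(film_frames):
    """Map film frames to video frames by alternating 2 and 3 fields per film frame."""
    fields = []
    for i, frame in enumerate(film_frames):
        repeat = 2 if i % 2 == 0 else 3
        fields.extend([frame] * repeat)
    # Pair consecutive fields into video frames (two fields each).
    return [tuple(fields[j:j + 2]) for j in range(0, len(fields) - 1, 2)]

video = pulldown_32(["A", "B", "C", "D"])
print(video)
```

Four film frames become ten fields, that is five video frames, which is the 24:30 ratio.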
Frame Rate and Interlacing
• For some reason, the brighter the still image presented to the viewer, the shorter the persistence of vision. So, bright pictures require more frequent repetition.
• If the space between pictures is longer than the period of persistence of vision, the image flickers. Large bright theater projectors avoid this problem by placing rotating shutters in front of the image in order to increase the repetition rate by a factor of two (to 48) or three (to 72) without changing the actual images.
  – Unfortunately, there is no easy way to "put a shutter" in front of a television broadcast! Therefore, to arrange for two "flashes" per frame, the flashes are created by interlacing.
• With interlacing, the number of "flashes" per frame is two, and the field rate is double the frame rate. Thus, NTSC systems have a field rate of 59.94 Hz and PAL/SECAM systems a field rate of 50 Hz.
Scanning Video
• Video is obtained via raster scanning, which transforms a 3-D signal p(x, y, t) into a one-dimensional signal s(t) which can be transmitted.
• Progressive scanning: left-to-right and top-to-bottom
  – Samples in time: frames/sec
  – Samples along y: lines
  – Samples along x: pixels (only for digital video)
• We perceive the images as continuous, not discrete: the human visual system performs the interpolation!
• How many frames, lines, and pixels?
[Figure: progressive scanning of Frame 1, Frame 2, ..., Frame K along the time axis t]
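Progressive raster scanning can be sketched for the digital case as flattening a frames-by-lines-by-pixels array into a single sequence. This is only an illustration with made-up sample values; `raster_scan` is a hypothetical helper name.

```python
def raster_scan(frames):
    """Flatten frames[t][y][x] into a 1-D list in progressive scan order."""
    signal = []
    for frame in frames:          # samples in time
        for line in frame:        # samples along y, top to bottom
            signal.extend(line)   # samples along x, left to right
    return signal

frames = [
    [[1, 2], [3, 4]],   # frame 1: two lines of two pixels each
    [[5, 6], [7, 8]],   # frame 2
]
s = raster_scan(frames)
print(s)
```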
Interlaced Scanning
[Figure: a frame with scan lines numbered 1 to 6]
• If the frame rate is too low -> flickering and jagged movements
• Tradeoff between spatial and temporal resolution
  – Slow moving objects with high spatial resolution
  – Fast moving objects with high frame rate
• Interlaced scanning: scan all even lines, then scan all odd lines.
• A frame is divided into 2 fields (sampled at different times)
[Figure: the lines of a frame split into an even field and an odd field]
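Splitting a frame into its two fields, and weaving them back together, can be sketched as follows. Lines are numbered from 1 as in the figure, and the helper names `split_fields` and `weave` are illustrative. In a real interlaced system the two fields are sampled at different times, which this static example does not model.

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into (even_field, odd_field)."""
    odd_field = frame[0::2]    # lines 1, 3, 5, ...
    even_field = frame[1::2]   # lines 2, 4, 6, ...
    return even_field, odd_field

def weave(even_field, odd_field):
    """Interleave two fields back into a full frame."""
    frame = []
    for odd, even in zip(odd_field, even_field):
        frame.extend([odd, even])
    if len(odd_field) > len(even_field):   # frame with an odd number of lines
        frame.append(odd_field[-1])
    return frame

frame = [1, 2, 3, 4, 5, 6]
even, odd = split_fields(frame)
print(even, odd)
```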
RGB Color Model
• Three basic colors
  – R: Red
  – G: Green
  – B: Blue
• A picture consists of three images
[Figure: a picture decomposed into its R, G, and B component images]
YIQ Color Model
YIQ color model: used in NTSC color TV
• Y - Luminance containing brightness and detail (monochrome TV)
• To create the Y signal, the red, green and blue inputs must be weighted to compensate for the eye's unequal sensitivity to the three colors.
  – Y = 0.3R + 0.59G + 0.11B
• Chrominance
  – I = 0.6R - 0.28G - 0.32B (cyan-orange axis)
  – Q = 0.21R - 0.52G + 0.31B (purple-green axis)
• Human eyes are most sensitive to Y, then to I, then to Q.
[Figure: Y, I, and Q component images]
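The three YIQ equations translate directly into code. This sketch uses the slide's rounded coefficients and assumes R, G, B are normalized to the range 0 to 1; `rgb_to_yiq` is an illustrative name.

```python
def rgb_to_yiq(r, g, b):
    """Convert normalized RGB (0..1) to YIQ using the slide's coefficients."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    i = 0.60 * r - 0.28 * g - 0.32 * b
    q = 0.21 * r - 0.52 * g + 0.31 * b
    return y, i, q

white = rgb_to_yiq(1.0, 1.0, 1.0)
red = rgb_to_yiq(1.0, 0.0, 0.0)
print(white, red)
```

For white (and any gray) the chrominance terms cancel, because the coefficients of I and of Q each sum to zero.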
YUV Color Model
• YUV color model: used for PAL TV and CCIR 601 standard
• Same definition for Y as in YIQ model
• Chrominance is defined by U and V, the color differences
  – U = B - Y
  – V = R - Y
[Figure: Y, U, and V component images]
YCbCr Color Model
• YCbCr color model: used in JPEG and MPEG
• Closely related to YUV: scaled and shifted YUV
  – Cb = ((B - Y)/2) + 0.5
  – Cr = ((R - Y)/1.6) + 0.5
• Chrominance values in YCbCr are always in the range of 0 to 1 (normalization)
  – Makes digital processing easy
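The claim that Cb and Cr stay within 0 to 1 can be checked numerically. The sketch below assumes normalized RGB inputs (0 to 1) and the luminance weights Y = 0.3R + 0.59G + 0.11B from the YIQ slide. Because the transform is linear, checking the eight corners of the RGB cube is enough. The function name is illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert normalized RGB (0..1) to YCbCr using the slide's formulas."""
    y = 0.30 * r + 0.59 * g + 0.11 * b
    cb = (b - y) / 2 + 0.5
    cr = (r - y) / 1.6 + 0.5
    return y, cb, cr

# Extremes of a linear transform occur at the corners of the RGB cube.
corners = [(r, g, b) for r in (0, 1) for g in (0, 1) for b in (0, 1)]
chroma = [rgb_to_ycbcr(*c)[1:] for c in corners]
print(min(min(c) for c in chroma), max(max(c) for c in chroma))
```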
Color Models in Video (Cont…)
• Color models based on linear transformation from RGB color space
  C = M_{3×3} · C_{RGB}
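As an illustration of the matrix form, the YUV model can be written as a 3x3 matrix whose rows are obtained by substituting Y = 0.3R + 0.59G + 0.11B into U = B - Y and V = R - Y. The matrix `M_YUV` and the helper `transform` below are derived for this sketch and are not given on the slides.

```python
# Rows: Y, U = B - Y, V = R - Y, each expressed in terms of (R, G, B).
M_YUV = [
    [0.30, 0.59, 0.11],
    [-0.30, -0.59, 0.89],
    [0.70, -0.59, -0.11],
]

def transform(matrix, rgb):
    """Multiply a 3x3 matrix by an RGB column vector."""
    return tuple(sum(m * c for m, c in zip(row, rgb)) for row in matrix)

yuv_white = transform(M_YUV, (1.0, 1.0, 1.0))
yuv_red = transform(M_YUV, (1.0, 0.0, 0.0))
print(yuv_white, yuv_red)
```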
Analog NTSC and PAL Video
• NTSC Video: Japan, US, …
  - 525 scan lines per frame, 30 frames per second
  - Interlaced, each frame is divided into 2 fields, 262.5 lines/field
  - 20 lines reserved for control information at the beginning of each field
  - So a maximum of 485 lines of visible data
  - Color representation: YIQ color model
• PAL Video: China, UK, …
  - 625 scan lines per frame, 25 frames per second (40 msec/frame)
  - Interlaced, each frame is divided into 2 fields, 312.5 lines/field
  - Uses YUV color model
  - Approximately 20% more lines than NTSC
• NTSC vs. PAL: roughly the same bandwidth
Digital Video
• Advantages over analog:
  – Direct random access --> good for nonlinear video editing
  – No problem for repeated recording
  – No need for blanking and sync pulse
• Almost all digital video uses component video
• The human eye responds more precisely to brightness information than it does to color; chroma subsampling (decimating) takes advantage of this.
  – In a 4:4:4 scheme, each 8×8 matrix of RGB pixels converts to three YCrCb 8×8 matrices: one for luminance (Y) and one for each of the two chrominance bands (Cr and Cb).
  – A 4:2:2 scheme also creates one 8×8 luminance matrix but decimates every two horizontal pixels to create each chrominance-matrix entry, thus reducing the amount of data to 2/3 of a 4:4:4 scheme.
  – Ratios of 4:2:0 decimate chrominance both horizontally and vertically, resulting in four Y, one Cr, and one Cb 8×8 matrix for every four 8×8 pixel-matrix sources. This conversion creates half the data required in a 4:4:4 chroma ratio.
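The 2/3 and 1/2 figures can be reproduced by counting samples. The sketch below assumes the usual J:a:b reading of the notation: over a region J pixels wide and 2 rows high, there are a chroma samples (per channel) in the first row and b in the second. The function name `relative_size` is illustrative.

```python
from fractions import Fraction

def relative_size(j, a, b):
    """Data volume of a J:a:b scheme relative to 4:4:4 (same bits per sample)."""
    luma = 2 * j              # one Y sample per pixel over a J x 2 region
    chroma = 2 * (a + b)      # Cb and Cr samples over the same region
    full = 3 * 2 * j          # Y, Cb, Cr at every pixel
    return Fraction(luma + chroma, full)

for scheme in [(4, 4, 4), (4, 2, 2), (4, 1, 1), (4, 2, 0)]:
    print(scheme, relative_size(*scheme))
```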
Luma Sampling and Chroma Sub-Sampling
• Chroma subsampling: the human visual system is more sensitive to luminance than to chrominance
  – We can subsample chrominance
• 4:4:4 – No subsampling
• 4:2:2, 4:1:1 – horizontally subsample
• 4:2:0 – horizontally and vertically
[Figure: sampling patterns for 4:2:2, 4:1:1, and 4:2:0]
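A minimal sketch of chroma decimation on one chroma plane follows. It assumes even plane dimensions and averages neighbouring samples; averaging is one possible choice made for this example (simply dropping samples is another), and the function names are illustrative.

```python
def subsample_422(plane):
    """Halve horizontal chroma resolution by averaging pixel pairs."""
    return [[(row[x] + row[x + 1]) / 2 for x in range(0, len(row), 2)]
            for row in plane]

def subsample_420(plane):
    """Halve chroma resolution in both directions by averaging 2x2 blocks."""
    height, width = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1] +
              plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, width, 2)]
            for y in range(0, height, 2)]

cb = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [50, 60, 70, 80],
]
print(subsample_422(cb))
print(subsample_420(cb))
```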
Exercise 1
Given a color image of size 288x352 pixels, sampled in 4:2:2 format, with each sample taking a value in 0…255 (8 bits/sample).
a/ Compute the size of the image.
b/ The image is JPEG-compressed, and its size after compression is 50 KB. Compute the compression ratio of the image.
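One way to set up this calculation is sketched below. It assumes that 4:2:2 halves the horizontal chroma resolution, so Cb and Cr each carry half as many samples as Y, and that 1 KB = 1024 bytes; with 1 KB = 1000 bytes the ratio changes slightly.

```python
pixels = 288 * 352

y_bytes = pixels                 # 8 bits (1 byte) per luma sample
cb_bytes = pixels // 2           # 4:2:2: half the luma sample count
cr_bytes = pixels // 2
raw_bytes = y_bytes + cb_bytes + cr_bytes

compressed_bytes = 50 * 1024     # assumption: 1 KB = 1024 bytes
ratio = raw_bytes / compressed_bytes

print(raw_bytes, "bytes uncompressed; compression ratio about %.2f" % ratio)
```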
Exercise 2
Given a color image of size 288x352 pixels, sampled in 4:2:2 format, with each sample taking a value in 0…255 (8 bits/sample).
The image is JPEG-encoded with a compression ratio of 10 for the luminance image Y and a compression ratio of 20 for the color-difference signals Cb and Cr.
Compute the overall compression ratio of the image.
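A possible way to set up this calculation, assuming that under 4:2:2 each of Cb and Cr carries half as many samples as Y, is sketched below. Exact fractions are used so that no rounding is hidden.

```python
from fractions import Fraction

pixels = 288 * 352
y_bytes = pixels                      # 1 byte per luma sample
chroma_bytes = 2 * (pixels // 2)      # Cb + Cr under 4:2:2
raw_bytes = y_bytes + chroma_bytes

compressed = Fraction(y_bytes, 10) + Fraction(chroma_bytes, 20)
overall_ratio = raw_bytes / compressed

print("overall compression ratio about %.2f" % float(overall_ratio))
```

Because Y and the two chroma planes together hold equal amounts of raw data here, the overall ratio works out to 2 / (1/10 + 1/20), independent of the image size.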
Sample Quantization – Pixel Resolution
• Pixel resolution depends on the number of quantization levels (bits per sample)
• Usually, 8 bits for each luma/chroma sample when no compression is applied
  – 8 bits (1 byte) per pixel for a gray image, 24 bits (3 bytes) for a true color image

Luminance (gray) picture
Picture   Num. levels   Bits
(a)       2             1 (monochrome)
(b)       4             2
(c)       8             3
(d)       16            4
(e)       32            5
(f)       64            6
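The effect of reducing the number of quantization levels can be sketched by requantizing an 8-bit gray sample to fewer bits. The mapping below (uniform levels spread over 0 to 255) is one simple choice assumed for this example; `requantize` is an illustrative name.

```python
def requantize(sample, bits):
    """Map an 8-bit sample (0..255) onto 2**bits uniform levels, scaled back to 0..255."""
    levels = 2 ** bits
    index = sample * levels // 256        # level index in 0..levels-1
    return index * 255 // (levels - 1)    # spread the levels over 0..255

for bits in (1, 2, 3, 4, 5, 6):
    distinct = {requantize(s, bits) for s in range(256)}
    print(bits, "bits ->", len(distinct), "gray levels")
```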
Digital Video
• Analog TV is a continuous signal
• Digital TV uses discrete numeric values
  – Signal is sampled, and samples are quantized
  – Image represented by pixel array
• Sub-sampling to reduce image resolution or size

Format        Resolution (pixels)   Approx. pixel count
QSIF          160 x 120             19 Kp
SIF           352 x 240             82 Kp
601           720 x 486             300 Kp
SVGA          800 x 600             500 Kp
ATV           1280 x 720            1 Mp
Workstation   1152 x 900            1 Mp
HDTV          1920 x 1080           2 Mp
HDTV

Name            Lines   Aspect ratio   Opt. viewing distance (picture heights, H)   P/I   Freq. (MHz)
HDTV USA, ana   1050    16:9           2.5                                          P     8
HDTV Eur, ana   1250    16:9           2.4                                          P     9
HDTV NHK        1125    16:9           3.3                                          I     20
NTSC©           525     4:3            7                                            I     4.2
NTSC            525     4:3            5                                            P     4.2
PAL©            625     4:3            6                                            I     5.5
PAL             625     4:3            4.3                                          P     5.5
SECAM©          625     4:3            6                                            I     6
SECAM           625     4:3            4.3                                          P     6

© : Conventional; P = progressive, I = interlaced
Computer Video Format
• Depends on the input and output devices (digitizers) for the motion video medium.
• Digitizers differ in frame resolution, quantization and frame rate
  – IRIS video board VINO takes an NTSC video signal and after digitization can achieve a frame resolution of 640x480 pixels, 8 bits/pixel and 4 fps.
  – SunVideo digitizer captures an NTSC video signal in the form of an RGB signal with a frame resolution of 320x240 pixels, 8 bits/pixel and 30 fps.
• Computer video controller standards
  – The Color Graphics Adapter (CGA):
    320 x 200 pixels x 2 bits/pixel = 16,000 bytes (storage capacity per image)
  – The Enhanced Graphics Adapter (EGA):
    640 x 350 pixels x 4 bits/pixel = 112,000 bytes
  – The Video Graphics Array (VGA):
    640 x 480 pixels x 8 bits/pixel = 307,200 bytes
  – The 8514/A Display Adapter Mode:
    1024 x 768 pixels x 8 bits/pixel = 786,432 bytes
  – The Extended Graphics Array (XGA):
    1024x768 at 256 colors or 640x480 at 65,000 colors
  – The Super VGA (SVGA):
    Up to 1024x768 pixels x 24 bits/pixel = 2,359,296 bytes
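The storage figures for the controller standards are width x height x bits per pixel, divided by 8. The sketch below recomputes them; it uses 320 x 200 for CGA, since that is the resolution that yields the quoted 16,000 bytes at 2 bits/pixel. The dictionary `modes` and the helper `frame_bytes` are illustrative names.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Storage needed for one uncompressed frame, in bytes."""
    return width * height * bits_per_pixel // 8

modes = {
    "CGA": (320, 200, 2),      # assumed 320x200, which matches 16,000 bytes
    "EGA": (640, 350, 4),
    "VGA": (640, 480, 8),
    "8514/A": (1024, 768, 8),
    "SVGA": (1024, 768, 24),
}

for name, (w, h, bpp) in modes.items():
    print(name, frame_bytes(w, h, bpp), "bytes per image")

# Uncompressed data rate of the SunVideo example: 320x240, 8 bits/pixel, 30 fps.
sunvideo_rate = frame_bytes(320, 240, 8) * 30
print("SunVideo:", sunvideo_rate, "bytes per second")
```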
Video Compression
Introduction (1/2)
• Why is video compression important?
• One movie video without compression
  – 720 x 480 pixels per frame
  – 30 frames per second
  – Total 90 minutes
  – Full color (24 bits, i.e. 3 bytes, per pixel)
  – The full data quantity = 167.96 GB!
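The 167.96 GB figure can be reproduced with a few lines of arithmetic, assuming "full color" means 3 bytes per pixel and 1 GB = 10^9 bytes.

```python
width, height = 720, 480
bytes_per_pixel = 3              # assumption: 24-bit full color
frames_per_second = 30
duration_seconds = 90 * 60

total_bytes = (width * height * bytes_per_pixel
               * frames_per_second * duration_seconds)

print("%.2f GB uncompressed" % (total_bytes / 1e9))
```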