
Introduction to Information Optics

Edited by

FRANCIS T. S. YU
The Pennsylvania State University

SHIZHUO YIN
The Pennsylvania State University

SUGANDA JUTAMULIA
Blue Sky Research

ACADEMIC PRESS
A Harcourt Science and Technology Company

San Diego   San Francisco   New York   Boston   London   Sydney   Tokyo
This book is printed on acid-free paper.

Copyright © 2001 by Academic Press

All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher.

Requests for permission to make copies of any part of the work should be mailed to: Permissions Department, Harcourt Brace & Company, 6277 Sea Harbor Drive, Orlando, Florida 32887-6777.

ACADEMIC PRESS
A Harcourt Science and Technology Company
525 B Street, Suite 1900, San Diego, California 92101-4495, USA

ACADEMIC PRESS
24-28 Oval Road, London NW1 7DX, UK

Library of Congress Cataloging Number: 2001089409
ISBN: 0-12-774811-3

PRINTED IN THE UNITED STATES OF AMERICA
01 02 03 04 05 SB 9 8 7 6 5 4 3 2 1
Contents

Preface

Chapter 1. Entropy Information and Optics   1
  1.1. Information Transmission   2
  1.2. Entropy Information   4
  1.3. Communication Channel   9
    1.3.1. Memoryless Discrete Channel   10
    1.3.2. Continuous Channel   11
  1.4. Band-limited Analysis   19
    1.4.1. Degrees of Freedom   23
    1.4.2. Gabor's Information Cell   25
  1.5. Signal Analysis   26
    1.5.1. Signal Detection   28
    1.5.2. Statistical Signal Detection   29
    1.5.3. Signal Recovering   32
    1.5.4. Signal Ambiguity   34
    1.5.5. Wigner Distribution   39
  1.6. Trading Information with Entropy   41
    1.6.1. Demon Exorcist   42
    1.6.2. Minimum Cost of Entropy   45
  1.7. Accuracy and Reliability Observation   47
    1.7.1. Uncertainty Observation   51
  1.8. Quantum Mechanical Channel   54
    1.8.1. Capacity of a Photon Channel   56
  References   60
  Exercises   60
Chapter 2. Signal Processing with Optics   67
  2.1. Coherence Theory of Light   67
  2.2. Processing under Coherent and Incoherent Illumination   72
  2.3. Fresnel-Kirchhoff and Fourier Transformation   76
    2.3.1. Free Space Impulse Response   76
    2.3.2. Fourier Transformation by Lenses   77
  2.4. Fourier Transform Processing   79
    2.4.1. Fourier Domain Filter   79
    2.4.2. Spatial Domain Filter   82
    2.4.3. Processing with Fourier Domain Filters   83
    2.4.4. Processing with Joint Transformation   85
    2.4.5. Hybrid Optical Processing   88
  2.5. Image Processing with Optics   89
    2.5.1. Correlation Detection   89
    2.5.2. Image Restoration   93
    2.5.3. Image Subtraction   98
    2.5.4. Broadband Signal Processing   98
  2.6. Algorithms for Processing   103
    2.6.1. Mellin-Transform Processing   104
    2.6.2. Circular Harmonic Processing   105
    2.6.3. Homomorphic Processing   107
    2.6.4. Synthetic Discriminant Algorithm   108
    2.6.5. Simulated Annealing Algorithm   112
  2.7. Processing with Photorefractive Optics   115
    2.7.1. Photorefractive Effect and Materials   115
    2.7.2. Wave Mixing and Multiplexing   118
    2.7.3. Bragg Diffraction Limitation   121
    2.7.4. Angular and Wavelength Selectivities   122
    2.7.5. Shift-Invariant Limited Correlators   125
  2.8. Processing with Incoherent Light   131
    2.8.1. Exploitation of Coherence   131
    2.8.2. Signal Processing with White Light   135
    2.8.3. Color Image Preservation and Pseudocoloring   138
  2.9. Processing with Neural Networks   141
    2.9.1. Optical Neural Networks   142
    2.9.2. Hopfield Model   143
    2.9.3. Interpattern Association Model   144
  References   147
  Exercises   148
Chapter 3. Communication with Optics   163
  3.1. Motivation of Fiber-Optic Communication   163
  3.2. Light Propagation in Optical Fibers   164
    3.2.1. Geometric Optics Approach   164
    3.2.2. Wave-Optics Approach   164
    3.2.3. Other Issues Related to Light Propagating in Optical Fiber   168
  3.3. Critical Components   184
    3.3.1. Optical Transmitters for Fiber-Optic Communications: Semiconductor Lasers   184
    3.3.2. Optical Receivers for Fiber-Optic Communications   188
    3.3.3. Other Components Used in Fiber-Optic Communications   192
  3.4. Fiber-Optic Networks   192
    3.4.1. Types of Fiber-Optic Networks Classified by Physical Size   193
    3.4.2. Physical Topologies and Routing Topologies Relevant to Fiber-Optic Networks   193
    3.4.3. Wavelength Division Multiplexed Optics Networks   193
    3.4.4. Testing Fiber-Optic Networks   195
  References   198
  Exercises   198
Chapter 4. Switching with Optics   201
  4.1. Figures of Merits for an Optical Switch   202
  4.2. All-Optical Switches   203
    4.2.1. Optical Nonlinearity   205
    4.2.2. Etalon Switching Devices   205
    4.2.3. Nonlinear Directional Coupler   208
    4.2.4. Nonlinear Interferometric Switches   211
  4.3. Fast Electro-optic Switches: Modulators   219
    4.3.1. Direct Modulation of Semiconductor Lasers   220
    4.3.2. External Electro-optic Modulators   225
    4.3.3. MEMS Switches Without Moving Parts   236
  4.4. Optical Switching Based on MEMS   236
    4.4.1. MEMS Fabrications   237
    4.4.2. Electrostatic Actuators   238
    4.4.3. MEMS Optical Switches   242
  4.5. Summary   247
  References   248
  Exercises   250
Chapter 5. Transformation with Optics   255
  5.1. Huygens-Fresnel Diffraction   256
  5.2. Fresnel Transform   257
    5.2.1. Definition   257
    5.2.2. Optical Fresnel Transform   257
  5.3. Fourier Transform   259
  5.4. Wavelet Transform   260
    5.4.1. Wavelets   260
    5.4.2. Time-frequency Joint Representation   261
    5.4.3. Properties of Wavelets   262
  5.5. Physical Wavelet Transform   264
    5.5.1. Electromagnetic Wavelet   264
    5.5.2. Electromagnetic Wavelet Transform   266
    5.5.3. Electromagnetic Wavelet Transform and Huygens Diffraction   260
  5.6. Wigner Distribution Function   270
    5.6.1. Definition   270
    5.6.2. Inverse Transform   271
    5.6.3. Geometrical Optics Interpretation   271
    5.6.4. Wigner Distribution Optics   272
  5.7. Fractional Fourier Transform   275
    5.7.1. Definition   275
    5.7.2. Fractional Fourier Transform and Fresnel Diffraction   277
  5.8. Hankel Transform   279
    5.8.1. Fourier Transform in Polar Coordinate System   279
    5.8.2. Hankel Transform   281
  5.9. Radon Transform   282
    5.9.1. Definition   282
    5.9.2. Image Reconstruction   283
  5.10. Geometric Transform   284
    5.10.1. Basic Geometric Transformations   288
    5.10.2. Generalized Geometric Transformation   288
    5.10.3. Optical Implementation   289
  5.11. Hough Transform   292
    5.11.1. Definition   292
    5.11.2. Optical Hough Transform   293
  References   294
  Exercises   295
Chapter 6. Interconnection with Optics   299
  6.1. Introduction   299
  6.2. Polymer Waveguides   303
    6.2.1. Polymeric Materials for Waveguide Fabrication   303
    6.2.2. Fabrication of Low-Loss Polymeric Waveguides   305
    6.2.3. Waveguide Loss Measurement   310
  6.3. Thin-Film Waveguide Couplers   312
    6.3.1. Surface-Normal Grating Coupler Design and Fabrication   312
    6.3.2. 45° Surface-Normal Micromirror Couplers   326
  6.4. Integration of Thin-Film Photodetectors   331
  6.5. Integration of Vertical Cavity Surface-Emitting Lasers (VCSELs)   334
  6.6. Optical Clock Signal Distribution   339
  6.7. Polymer Waveguide-Based Optical Bus Structure   343
    6.7.1. Optical Equivalent for Electronic Bus Logic Design   345
  6.8. Summary   348
  References   349
  Exercises   351
Chapter 7. Pattern Recognition with Optics   355
  7.1. Basic Architectures   356
    7.1.1. Correlators   356
    7.1.2. Neural Networks   357
    7.1.3. Hybrid Optical Architectures   357
    7.1.4. Robustness of JTC   362
  7.2. Recognition by Correlation Detections   364
    7.2.1. Nonconventional Joint-Transform Detection   364
    7.2.2. Nonzero-order Joint-Transform Detection   368
    7.2.3. Position-Encoding Joint-Transform Detection   370
    7.2.4. Phase-Representation Joint-Transform Detection   371
    7.2.5. Iterative Joint-Transform Detection   372
  7.3. Polychromatic Pattern Recognition   375
    7.3.1. Detection with Temporal Fourier-Domain Filters   376
    7.3.2. Detection with Spatial-Domain Filters   377
  7.4. Target Tracking   380
    7.4.1. Autonomous Tracking   380
    7.4.2. Data Association Tracking   382
  7.5. Pattern Recognition Using Composite Filtering   387
    7.5.1. Performance Capacity   388
    7.5.2. Quantization Performance   390
  7.6. Pattern Classification   394
    7.6.1. Nearest Neighbor Classifiers   395
    7.6.2. Optical Implementation   398
  7.7. Pattern Recognition with Photorefractive Optics   401
    7.7.1. Detection by Phase Conjugation   401
    7.7.2. Wavelength-Multiplexed Matched Filtering   404
    7.7.3. Wavelet Matched Filtering   407
  7.8. Neural Pattern Recognition   411
    7.8.1. Recognition by Supervised Learning   412
    7.8.2. Recognition by Unsupervised Learning   414
    7.8.3. Polychromatic Neural Networks   418
  References   422
  Exercises   423
Chapter 8. Information Storage with Optics   435
  8.1. Digital Information Storage   435
  8.2. Upper Limit of Optical Storage Density   436
  8.3. Optical Storage Media   438
    8.3.1. Photographic Film   438
    8.3.2. Dichromated Gelatin   439
    8.3.3. Photopolymers   439
    8.3.4. Photoresists   440
    8.3.5. Thermoplastic Film   440
    8.3.6. Photorefractive Materials   441
    8.3.7. Photochromic Materials   442
    8.3.8. Electron-Trapping Materials   442
    8.3.9. Two-Photon-Absorption Materials   443
    8.3.10. Bacteriorhodopsin   444
    8.3.11. Photochemical Hole Burning   444
    8.3.12. Magneto-optic Materials   445
    8.3.13. Phase-Change Materials   446
  8.4. Bit-Pattern Optical Storage   446
    8.4.1. Optical Tape   447
    8.4.2. Optical Disk   447
    8.4.3. Multilayer Optical Disk   448
    8.4.4. Photon-Gating 3-D Optical Storage   449
    8.4.5. Stacked-Layer 3-D Optical Storage   451
    8.4.6. Photochemical Hole-Burning 3-D Storage   454
  8.5. Holographic Optical Storage   454
    8.5.1. Principle of Holography   455
    8.5.2. Plane Holographic Storage   456
    8.5.3. Stacked Holograms for 3-D Optical Storage   458
    8.5.4. Volume Holographic 3-D Optical Storage   460
  8.6. Near Field Optical Storage   461
  8.7. Concluding Remarks   461
  References   465
  Exercises   468
Chapter 9. Computing with Optics   475
  9.1. Introduction   476
  9.2. Parallel Optical Logic and Architectures   477
    9.2.1. Optical Logic   477
    9.2.2. Space-Variant Optical Logic   481
    9.2.3. Programmable Logic Array   481
    9.2.4. Parallel Array Logic   485
    9.2.5. Symbolic Substitution   486
    9.2.6. Content-Addressable Memory   488
  9.3. Number Systems and Basic Operations   489
    9.3.1. Operations with Binary Number Systems   489
    9.3.2. Operations with Nonbinary Number Systems   499
  9.4. Parallel Signed-Digit Arithmetic   501
    9.4.1. Generalized Signed-Digit Number Systems   501
    9.4.2. MSD Arithmetic   503
    9.4.3. TSD Arithmetic   530
    9.4.4. QSD Arithmetic   534
  9.5. Conversion between Different Number Systems   543
    9.5.1. Conversion between Signed-Digit and Complement Number Systems   544
    9.5.2. Conversion between NSD and Negabinary Number Systems   546
  9.6. Optical Implementation   549
    9.6.1. Symbolic Substitution Implemented by Matrix-Vector Operation   549
    9.6.2. SCAM-Based Incoherent Correlator for QSD Addition   551
    9.6.3. Optical Logic Array Processor for Parallel NSD Arithmetic   558
  9.7. Summary   560
  References   562
  Exercises   569
Chapter 10. Sensing with Optics   571
  10.1. Introduction   571
  10.2. A Brief Review of Types of Fiber-Optic Sensors   572
    10.2.1. Intensity-Based Fiber-Optic Sensors   572
    10.2.2. Polarization-Based Fiber-Optic Sensors   575
    10.2.3. Phase-Based Fiber-Optic Sensors   583
    10.2.4. Frequency (or Wavelength)-Based Fiber-Optic Sensors   587
  10.3. Distributed Fiber-Optic Sensors   589
    10.3.1. Intrinsic Distributed Fiber-Optic Sensors   589
    10.3.2. Quasi-distributed Fiber-Optic Sensors   600
  10.4. Summary   612
  References   613
  Exercises   615
Chapter 11. Information Display with Optics   617
  11.1. Introduction   617
  11.2. Information Display Using Acousto-optic Spatial Light Modulators   618
    11.2.1. The Acousto-optic Effect   618
    11.2.2. Intensity Modulation of Laser   625
    11.2.3. Deflection of Laser   628
    11.2.4. Laser TV Display Using Acousto-optic Devices   629
  11.3. 3-D Holographic Display   632
    11.3.1. Principles of Holography   632
    11.3.2. Optical Scanning Holography   638
    11.3.3. Synthetic Aperture Holography   640
  11.4. Information Display Using Electro-optic Spatial Light Modulators   643
    11.4.1. The Electro-optic Effect   643
    11.4.2. Electrically Addressed Spatial Light Modulators   647
    11.4.3. Optically Addressed Spatial Light Modulators   650
  11.5. Concluding Remarks   661
  References   661
  Exercises   664
Chapter 12. Networking with Optics   667
  12.1. Background   667
  12.2. Optical Network Elements   671
    12.2.1. Optical Fibers   671
    12.2.2. Optical Amplifiers   673
    12.2.3. Wavelength Division Multiplexer/Demultiplexer   678
    12.2.4. Transponder   687
    12.2.5. Optical Add/Drop Multiplexer   689
    12.2.6. Optical Cross-Connect   691
    12.2.7. Optical Monitoring   694
  12.3. Design of Optical Transport Network   696
    12.3.1. Optical Fiber Dispersion Limit   696
    12.3.2. Optical Fiber Nonlinearity Limit   697
    12.3.3. System Design Examples   701
  12.4. Applications and Future Development of Optical Networks   704
    12.4.1. Long-haul Backbone Networks   704
    12.4.2. Metropolitan and Access Networks   709
    12.4.3. Future Development   711
  References   714
  Exercises   715

Index   719
Preface

We live in a world bathed in light. Light is not only the source of energy necessary to live (plants grow by drawing energy from sunlight); it is also the source of energy for information: our vision is based on light detected by our eyes (although we do not grow by drawing energy from light into our bodies through our eyes). Furthermore, applications of optics to information technology are not limited to vision and can be found almost everywhere. The deployment rate of optical technology is extraordinary. For example, optical fiber for telecommunication is being installed at a rate of about one kilometer every second worldwide. Thousands of optical disk players, computer monitors, and television displays are produced daily.

The book summarizes and reviews the state of the art in information optics, which is optical science in information technology. The book consists of 12 chapters written by active researchers in the field. Chapter 1 provides the theoretical relation between optics and information theory. Chapter 2 reviews the basis of optical signal processing. Chapter 3 describes the principle of fiber-optic communication. Chapter 4 discusses optical switches used for communication and parallel processing. Chapter 5 discusses integral transforms, which can be performed optically in parallel. Chapter 6 presents the physics of optical interconnects used for computing and switching. Chapter 7 reviews pattern recognition, including neural networks, based on the optical Fourier transform and other optical techniques. Chapter 8 discusses optical storage, including holographic memory, 3D memory, and near-field optics. Chapter 9 reviews digital optical computing, which takes advantage of the parallelism of optics. Chapter 10 describes the principles of optical fiber sensors. Chapter 11 introduces advanced displays, including 3D holographic display. Chapter 12 presents an overview of fiber-optic networks.

The book is not intended to be encyclopedic and exhaustive. Rather, it merely reflects current selected interests in optical applications to information technology. In view of the great number of contributions in this area, we have not been able to include all of them in this book.
Chapter 1

Entropy Information and Optics

Francis T. S. Yu
THE PENNSYLVANIA STATE UNIVERSITY
Light is not only the mainstream of energy that supports life; it also provides us with an important source of information. One can easily imagine that without light our present civilization would never have emerged. Furthermore, humans are equipped with exceptionally good (although not perfect) eyes, along with an intelligent brain. Humans were therefore able to advance themselves above the rest of the animals on this planet Earth. It is undoubtedly true that if humans did not have eyes, they would not have evolved into their present form. In the presence of light, humans are able to search for the food they need and the art they enjoy, and to explore the unknown. Thus light, or rather optics, provides us with a very valuable source of information, the application of which can range from very abstract artistic interpretations to very efficient scientific usages.

This chapter discusses the relationship between entropy information and optics. Our intention is not to provide a detailed discussion, however, but to cover the basic fundamentals that are easily applied to optics. We note that entropy information was not originated by optical scientists, but rather by a group of mathematically oriented electrical engineers whose original interest was centered on electrical communication. Nevertheless, from the very beginning of the discovery of entropy information, interest in its application has never been totally absent from the optical standpoint. As a result of the recent development of optical communication, signal processing, and computing, among other discoveries, the relationship between optics and entropy information has grown more profound than ever.
1.1. INFORMATION TRANSMISSION
Although we seem to know the meaning of the word information, fundamentally that may not be the case. In reality, information may be defined as related to usage. From the viewpoint of mathematical formalism, entropy information is basically a probabilistic concept. In other words, without probability theory there would be no entropy information.
An information transmission system can be represented by a block diagram, as shown in Fig. 1.1. For example, a message represents an information source which is to be sent by means of a set of written characters that represent a code. If the set of written characters is recorded on a piece of paper, the information still cannot be transmitted until the paper is illuminated by a visible light (the transmitter), which obviously acts as an information carrier. When light reflected from the written characters arrives at your eyes (the receiver), a proper decoding (translating) process takes place; that is, character recognition (decoding) by the user (your mind). From this simple example we can see that a suitable encoding process may not be adequate unless a suitable decoding process also takes place. For instance, if I show you a foreign newspaper you might not be able to decode the language, even though the optical channel is assumed to be perfect (i.e., noiseless). This is because a suitable decoding process requires a priori knowledge of the encoding scheme; for example, knowledge of the foreign characters. Thus the decoding process is also known as a recognition process.

Fig. 1.1. Block diagram of a communication system.

Information transmission can in fact be represented by spatial and temporal information. The preceding example of transmitting written characters obviously represents a spatial information transmission. On the other hand, if the written language is transmitted by coded light pulses, then this language should be properly (temporally) encoded; for instance, as transmitted through an optical fiber, which represents a temporal communication channel. Needless to say, at the receiving end a temporal decoding process is required before the temporally coded language is sent to the user. Viewing a television show, for example, represents a one-way spatial-temporal transmission. It is interesting to note that temporal and spatial information can be traded for information transmission. For instance, television signal transmission is a typical example of exploiting temporal information transmission for spatial information transmission. On the other hand, a movie sound track is an example of exploiting spatial information transmission for temporal information transmission.
Information transmission has two basic disciplines: one developed by Wiener [1.1, 1.2], and the other by Shannon [1.3, 1.4]. Although both Wiener and Shannon share a common interest, there is a basic distinction between their ideas. The significance of Wiener's work is that, if a signal (information) is corrupted by some physical means (e.g., noise, nonlinear distortion), it may be possible to recover the signal from the corrupted one. It is for this purpose that Wiener advocates correlation detection, optimum prediction, and other ideas. However, Shannon carries his work a step further. He shows that the signal can be optimally transferred provided it is properly encoded; that is, the signal to be transferred can be processed before and after transmission. In other words, it is possible to combat the disturbances in a communication channel by properly encoding the signal. This is precisely the reason that Shannon advocates the information measure of the signal, communication channel capacity, and signal coding processes. In fact, the main objective of Shannon's theory is the efficient utilization of the information channel.
A fundamental theorem proposed by Shannon may be the most surprising result of his work. The theorem can be stated approximately as follows: Given a stationary finite-memory information channel having a channel capacity C, if the binary information transmission rate R (which can be either spatial or temporal) of the signal is smaller than C, there exists a channel encoding and decoding process for which the probability of error per digit of information transfer can be made arbitrarily small. Conversely, if the information transmission rate R is larger than C, there exists no encoding and decoding process with this property; that is, the probability of error in information transfer cannot be made arbitrarily small. In other words, the presence of random disturbances in an information channel does not, by itself, limit transmission accuracy. Rather, it limits the transmission rate for which arbitrarily high transmission accuracy can be accomplished.
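As a rough numerical illustration, consider the following minimal Python sketch, which assumes a binary symmetric channel with crossover probability 0.1 (capacity C = 1 - H(0.1), about 0.53 bit per channel use) and a simple majority-voted repetition code; the channel model, code, and parameters are arbitrary choices made only for demonstration.

    import math
    import random

    def bsc(bits, p, rng):
        """Pass a list of bits through a binary symmetric channel with crossover probability p."""
        return [b ^ (rng.random() < p) for b in bits]

    def repetition_error_rate(n_repeat, p, n_trials, rng):
        """Empirical probability that majority-vote decoding of n_repeat noisy copies is wrong."""
        errors = 0
        for _ in range(n_trials):
            received = bsc([0] * n_repeat, p, rng)   # send the all-zero codeword
            if sum(received) * 2 > n_repeat:         # majority decodes to 1: a digit error
                errors += 1
        return errors / n_trials

    rng = random.Random(0)
    p = 0.1
    capacity = 1 + p * math.log2(p) + (1 - p) * math.log2(1 - p)   # C = 1 - H(p)
    print(f"BSC crossover p = {p}, capacity C = {capacity:.3f} bit per use")
    for n in (1, 3, 5, 11, 21):
        err = repetition_error_rate(n, p, 20000, rng)
        print(f"repetition length {n:2d}  rate = {1 / n:.3f}  error per digit ~ {err:.4f}")

A repetition code trades rate (1/n) for accuracy and does not approach capacity; nevertheless it exhibits the essential point of the theorem, namely that redundancy in the encoding, rather than a quieter channel, is what buys transmission accuracy.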
To conclude this section, we note that the distinction between the two communication disciplines is that Wiener assumes, in effect, that the signal in question can be processed after it has been corrupted by noise. Shannon suggests that the signal can be processed both before and after its transmission through the communication channel. However, both Wiener and Shannon share the same basic objective; namely, faithful reproduction of the original signal.

1.2. ENTROPY INFORMATION

Let us now define the information measure, which is one of the vitally important aspects in the development of Shannon's information theory. For simplicity, we consider discrete input and output message ensembles A = {a_i} and B = {b_j}, respectively, as applied to a communication channel, as shown in Fig. 1.2. If a_i is an input event as applied to the information channel and b_j is the corresponding transmitted output event, then the amount of information provided about the input event a_i by the received event b_j can be written as

    I(a_i; b_j) = \log_2 \frac{P(a_i/b_j)}{P(a_i)},    (1.1)

where P(a_i/b_j) is the conditional probability of input event a_i given the output event b_j, P(a_i) is the a priori probability of input event a_i, i = 1, 2, ..., M, and j = 1, 2, ..., N.
By the symmetric property of the joint probability, we show that

    I(a_i; b_j) = I(b_j; a_i).    (1.2)

In other words, the amount of information transferred by the output event b_j from a_i is the same amount as provided by the input event a_i that specified b_j. It is clear that, if the input and output events are statistically independent, that is, if P(a_i, b_j) = P(a_i)P(b_j), then I(a_i; b_j) = 0. Furthermore, if I(a_i; b_j) > 0, then P(a_i, b_j) > P(a_i)P(b_j); there is a higher joint probability of a_i and b_j. However, if I(a_i; b_j) < 0, then P(a_i, b_j) < P(a_i)P(b_j); there is a lower joint probability of a_i and b_j.

Fig. 1.2. An input-output communication channel.
As a result of the conditional probabilities P(a_i/b_j) \leq 1 and P(b_j/a_i) \leq 1, we see that

    I(a_i; b_j) \leq I(a_i),    (1.3)

and

    I(a_i; b_j) \leq I(b_j),    (1.4)

where I(a_i) \triangleq -\log_2 P(a_i) and I(b_j) \triangleq -\log_2 P(b_j) are defined as the respective input and output self-information of event a_i and event b_j. In other words, I(a_i) and I(b_j) represent the amount of information provided at the input and output of the information channel of event a_i and event b_j, respectively. It follows that the mutual information of event a_i and event b_j is equal to the self-information of event a_i if and only if P(a_i/b_j) = 1; then

    I(a_i; b_j) = I(a_i).    (1.5)

It is noted that, if Eq. (1.5) is true for all i, that is, for the whole input ensemble, then the communication channel is noiseless. However, if P(b_j/a_i) = 1, then

    I(a_i; b_j) = I(b_j).    (1.6)

If Eq. (1.6) is true for all the output ensemble, then the information channel is deterministic.
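To make Eqs. (1.1) through (1.6) concrete, the following minimal Python sketch assumes a hypothetical two-input, two-output channel specified by a joint probability table and evaluates the mutual information of every event pair together with the input self-information; the particular probabilities are arbitrary illustrative values, not taken from the text.

    import math

    # Hypothetical joint probabilities P(a_i, b_j) for a 2-input, 2-output channel
    # (rows: input events a1, a2; columns: output events b1, b2).
    P_ab = [[0.40, 0.10],
            [0.05, 0.45]]

    P_a = [sum(row) for row in P_ab]                                   # marginals P(a_i)
    P_b = [sum(P_ab[i][j] for i in range(2)) for j in range(2)]        # marginals P(b_j)

    for i in range(2):
        for j in range(2):
            p_a_given_b = P_ab[i][j] / P_b[j]                          # P(a_i / b_j)
            I_ab = math.log2(p_a_given_b / P_a[i])                     # Eq. (1.1)
            I_a = -math.log2(P_a[i])                                   # self-information I(a_i)
            print(f"I(a{i+1}; b{j+1}) = {I_ab:+.3f} bit,  I(a{i+1}) = {I_a:.3f} bit")

For the diagonal event pairs the joint probability exceeds the product of the marginal probabilities, so the mutual information is positive; for the off-diagonal pairs it is negative; and in every case it does not exceed the corresponding self-information, in agreement with Eq. (1.3).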
To conclude this section, we note that the information measure as defined in the preceding can be easily extended to higher product spaces. Since the measure of information can be characterized by an ensemble average, the average amount of information provided at the input end can be written as

    I(A) \triangleq -\sum_{A} P(a) \log_2 P(a) \triangleq H(A),    (1.8)

where the summation is over the input ensemble A. Similarly, the average amount of self-information provided at the output end can be written as

    I(B) \triangleq -\sum_{B} P(b) \log_2 P(b) \triangleq H(B).    (1.9)
These two equations are essentially of the same form as the entropy equation in statistical thermodynamics, for which the notations H(A) and H(B) are frequently used to describe information entropy. As we will see later, H(A) and H(B) indeed provide a profound relationship between the information entropy and the physical entropy. It is noted that entropy H, from the communication theory point of view, is a measure of uncertainty. However, from the statistical thermodynamic standpoint, H is a measure of disorder. In addition, we see that

    H(A) \geq 0,    (1.10)

since P(a) is always a positive quantity. The equality of Eq. (1.10) holds if P(a) = 1 or P(a) = 0. Thus, we can conclude that

    H(A) \leq \log_2 M,    (1.11)

where M is the number of input events. The equality holds for equiprobable input events; that is, P(a) = 1/M.
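The bounds of Eqs. (1.10) and (1.11) are easily checked numerically. The short sketch below, assuming three arbitrary example distributions over M = 4 input events, computes H(A) and confirms that it ranges from zero (a deterministic ensemble) up to log2 M (equiprobable events).

    import math

    def entropy(probs):
        """H(A) = -sum P(a) log2 P(a); zero-probability events contribute nothing."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    M = 4
    examples = {
        "deterministic": [1.0, 0.0, 0.0, 0.0],
        "skewed":        [0.7, 0.1, 0.1, 0.1],
        "equiprobable":  [0.25, 0.25, 0.25, 0.25],
    }
    print(f"upper bound log2 M = {math.log2(M):.3f} bits")
    for name, dist in examples.items():
        print(f"{name:13s} H(A) = {entropy(dist):.3f} bits")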
It is trivial to extend the ensemble average to the conditional entropy:

    I(B/A) = -\sum_{B} \sum_{A} P(a, b) \log_2 P(b/a) \triangleq H(B/A).    (1.12)

And the entropy of the product ensemble AB can also be written as

    H(AB) = -\sum_{A} \sum_{B} P(a, b) \log_2 P(a, b),    (1.13)

from which it follows that

    H(AB) = H(A) + H(B/A),    (1.14)

    H(AB) = H(B) + H(A/B).    (1.15)

In view of \ln u \leq u - 1, we also see that

    H(B/A) \leq H(B),    (1.16)

    H(A/B) \leq H(A),    (1.17)

where the equalities hold if and only if a and b are statistically independent.
Let us now turn to the definition of average mutual information. We consider first the conditional average mutual information,

    I(A; b_j) \triangleq \sum_{A} P(a/b_j) I(a; b_j).    (1.18)

Although the mutual information between an input event and an output event can be negative, I(a; b) < 0, the average conditional mutual information can never be negative:

    I(A; b) \geq 0,    (1.19)

with the equality holding if and only if the events of A are statistically independent of b; that is, P(a/b) = P(a) for all a. By taking the ensemble average of Eq. (1.19), the average mutual information can be defined as

    I(A; B) \triangleq \sum_{B} P(b) I(A; b).    (1.20)

Equation (1.20) can also be written as

    I(A; B) = \sum_{A} \sum_{B} P(a, b) I(a; b).    (1.21)

Again we see that

    I(A; B) \geq 0.    (1.22)

The equality holds if and only if a and b are statistically independent. Moreover, from the symmetric property of I(a; b), we see that

    I(A; B) = I(B; A).    (1.23)
In view of Eqs. (1.3) and (1.4), we also see that

    I(A; B) \leq H(A) = I(A),    (1.24)

    I(A; B) \leq H(B) = I(B),    (1.25)

which implies that the mutual information (the amount of information transfer) cannot be greater than the entropy information (the amount of information provided) at the input or the output ends, whichever comes first. We note that if the equality holds for Eq. (1.24) then the channel is noiseless; on the other hand, if the equality holds for Eq. (1.25), then the channel is deterministic.
Since Eq. (1.13) can be written as

    H(AB) = H(A) + H(B) - I(A; B),    (1.26)

we see that

    I(A; B) = H(A) - H(A/B),    (1.27)

    I(A; B) = H(B) - H(B/A),    (1.28)

where H(A/B) represents the amount of information loss (e.g., due to noise), or the equivocation of the channel, which is the average amount of information needed to specify the noise disturbance in the channel, and H(B/A) is referred to as the noise entropy of the channel.
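These relations are straightforward to verify numerically. The sketch below, reusing the hypothetical 2 x 2 joint probability table introduced earlier, computes the conditional entropies directly from Eq. (1.12), and then checks the chain rule of Eq. (1.14) together with the decompositions of Eqs. (1.26) through (1.28).

    import math

    P_ab = [[0.40, 0.10],
            [0.05, 0.45]]          # hypothetical joint probabilities P(a_i, b_j)

    P_a = [sum(row) for row in P_ab]
    P_b = [sum(P_ab[i][j] for i in range(2)) for j in range(2)]

    def H(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    H_A  = H(P_a)
    H_B  = H(P_b)
    H_AB = H([P_ab[i][j] for i in range(2) for j in range(2)])             # Eq. (1.13)
    H_B_given_A = -sum(P_ab[i][j] * math.log2(P_ab[i][j] / P_a[i])
                       for i in range(2) for j in range(2))                # Eq. (1.12)
    H_A_given_B = -sum(P_ab[i][j] * math.log2(P_ab[i][j] / P_b[j])
                       for i in range(2) for j in range(2))
    I_AB = H_A + H_B - H_AB                                                # Eq. (1.26)

    print(f"chain rule: H(AB) = {H_AB:.3f}   H(A) + H(B/A) = {H_A + H_B_given_A:.3f}")
    print(f"I(A;B) = {I_AB:.3f}   H(A) - H(A/B) = {H_A - H_A_given_B:.3f}"
          f"   H(B) - H(B/A) = {H_B - H_B_given_A:.3f}")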
To conclude this section, we note that the entropy information can be easily extended to a continuous product space, such as

    H(A) \triangleq -\int_{-\infty}^{\infty} p(a) \log_2 p(a) \, da,    (1.29)

    H(B) \triangleq -\int_{-\infty}^{\infty} p(b) \log_2 p(b) \, db,    (1.30)

    H(B/A) \triangleq -\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} p(a, b) \log_2 p(b/a) \, da \, db,    (1.31)

    H(A/B) \triangleq -\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} p(a, b) \log_2 p(a/b) \, da \, db,    (1.32)

and

    H(AB) \triangleq -\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} p(a, b) \log_2 p(a, b) \, da \, db,    (1.33)

where the p's are the probability density distributions.
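For the continuous case, the integrals of Eqs. (1.29) through (1.33) can be approximated by simple numerical quadrature. As an illustrative check (the Gaussian density is an arbitrary example, not taken from the text), the sketch below evaluates Eq. (1.29) for a zero-mean Gaussian of standard deviation sigma and compares the result with the well-known closed form (1/2) log2(2 pi e sigma^2).

    import math

    sigma = 2.0

    def p(x):
        """Zero-mean Gaussian probability density with standard deviation sigma."""
        return math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

    # Riemann-sum approximation of Eq. (1.29): H(A) = -integral p(a) log2 p(a) da
    dx = 1e-3
    H_numeric = -sum(p(x) * math.log2(p(x)) * dx
                     for x in (i * dx for i in range(-20000, 20001)))

    H_closed = 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)
    print(f"numerical H = {H_numeric:.4f} bits, closed form = {H_closed:.4f} bits")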
1.3. COMMUNICATION CHANNEL

An optical communication channel can be represented by an input-output block diagram, for which an input event a can be transferred into an output event b, as described by a transitional probability p(b/a). Thus, the input-output transitional probability P(B/A) describes the random noise disturbances in the channel.

Information channels are usually described according to the type of input-output ensemble and are considered discrete or continuous. If both the input and output of the channel are discrete events (discrete spaces), then the channel is called a discrete channel. But if both the input and output of the channel are represented by continuous events, the channel is called a continuous channel. However, a channel can have a discrete input and a continuous output, or vice versa. Accordingly, the channel is then called a discrete-continuous or continuous-discrete channel.

We note again that the terminology of discrete and continuous communication channels can also be extended to spatial and temporal domains. This concept is of particular importance for optical communication channels.

A communication channel can, in fact, have multiple inputs and multiple outputs. If the channel possesses only a single input terminal and a single output terminal, it is a one-way channel. However, if the channel possesses two input terminals and two output terminals, it is a two-way channel. Trivially, one can also have a channel with n input and m output terminals.

Since a communication channel is characterized by the input-output transitional probability distribution P(B/A), if the transitional probability distribution remains the same for all successive input and output events, then the channel is a memoryless channel. However, if the transitional probability distribution changes with the preceding events, whether at the input or the output, then the channel is a memory channel. Thus, if the memory is finite, that is, if the transitional probability depends on a finite number of preceding events, the channel is a finite-memory channel. Furthermore, if the transitional probability distribution depends on stochastic processes and the stochastic processes are assumed to be nonstationary, then the channel is a nonstationary channel. Similarly, if the stochastic processes the transitional probability depends on are stationary, then the channel is a stationary channel. In short, a communication channel can be fully described by the characteristics of its transitional probability distribution; for example, a discrete nonstationary memory channel.
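Because a channel is fully described by its transitional probability distribution, a discrete memoryless channel can be represented simply by a stochastic matrix whose rows are p(b/a) for each input event. The minimal sketch below, assuming a hypothetical binary channel and an arbitrary input distribution, builds such a matrix and computes the output distribution P(B) that it induces.

    # Hypothetical discrete memoryless channel: rows are p(b/a) for inputs a1, a2.
    P_b_given_a = [[0.9, 0.1],     # p(b1/a1), p(b2/a1)
                   [0.2, 0.8]]     # p(b1/a2), p(b2/a2)

    P_a = [0.6, 0.4]               # assumed input distribution P(A)

    # Output distribution: P(b_j) = sum_i P(a_i) p(b_j/a_i)
    P_b = [sum(P_a[i] * P_b_given_a[i][j] for i in range(2)) for j in range(2)]
    print("P(B) =", [round(p, 3) for p in P_b])

    # Each row of the transition matrix must sum to one (it is a probability distribution).
    assert all(abs(sum(row) - 1.0) < 1e-12 for row in P_b_given_a)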
Since a detailed discussion of various communication channels is beyond the scope of this chapter, we will evaluate two of the simplest, yet important, channels in the following subsections.

1.3.1. MEMORYLESS DISCRETE CHANNEL
For simplicity, we let an input message to be transmitted through the channel be

    \alpha^n = \alpha_1 \alpha_2 \cdots \alpha_n,

and the corresponding output message be

    \beta^n = \beta_1 \beta_2 \cdots \beta_n,

where \alpha_i and \beta_i are any one of the input and output events of A and B, respectively. Since the transitional probabilities for a memoryless channel do not depend on the preceding events, the composite transitional probability can be written as

    P(\beta^n/\alpha^n) = P(\beta_1/\alpha_1) P(\beta_2/\alpha_2) \cdots P(\beta_n/\alpha_n).

Thus, the joint probability of the output message \beta^n is

    P(\beta^n) = \sum_{A^n} P(\alpha^n) P(\beta^n/\alpha^n),

where the summation is over the A^n product space.
view
of

entropy information measure,
the
average mutual information
between
the
input
and
output messages (sequences)
of a" and
/?"
can be
written
as
I(A
H
;
B")
=
H(B")
-
H(B
n
/A"l
(1.34)
where
B"
is the
output product space,
for
which

H(B")
can be
written
as
H(B
n
)
=

B"
The
conditional entropy
H(B"/A")
is
H(B
n
/A")
=

£
P(a")P(p>")
Iog
2
P(j8"/a-).
(1.35)
A"
B
n
Since
I(A";

B")
represents
the
amount
of
information provided
by
then
n
output
events about
the
given
n
input events,
I(A
n
;
B")/n
is the
amount
of
mutual
information
per
event.
If the
channel
is
assumed memoryless,

I(A";
B
n
)/n
is
only
a
function
of
P(a")
and n.
Therefore,
the
capacity
of the
channel would
be the

×