Neural Networks
Neural Function
• Brain function (thought) occurs as the result of the firing of neurons
• Neurons connect to each other through synapses, which propagate action potentials (electrical impulses) by releasing neurotransmitters
– Synapses can be excitatory (potential-increasing) or inhibitory (potential-decreasing), and have varying activation thresholds
– Learning occurs as a result of the synapses' plasticity: they exhibit long-term changes in connection strength
• There are about 10^11 neurons and about 10^14 synapses in the human brain!
Based on slide by T. Finin, M. desJardins, L. Getoor, R. Parr
Biology of a Neuron
Brain Structure
• Different areas of the brain have different functions
– Some areas seem to have the same function in all humans (e.g., Broca's region for motor speech); the overall layout is generally consistent
– Some areas are more plastic, and vary in their function; also, the lower-level structure and function vary greatly
• We don't know how different functions are "assigned" or acquired
– Partly the result of the physical layout / connection to inputs (sensors) and outputs (effectors)
– Partly the result of experience (learning)
• We really don't understand how this neural structure leads to what we perceive as "consciousness" or "thought"
The “One Learning Algorithm” Hypothesis
[Figure: rewiring experiments on the somatosensory cortex and auditory cortex]
• Auditory cortex learns to see [Roe et al., 1992]
• Somatosensory cortex learns to see [Metin & Frost, 1989]
Based on slide by Andrew Ng
Sensor Representations in the Brain
• Seeing with your tongue
• Human echolocation (sonar)
• Haptic belt: direction sense
• Implanting a 3rd eye
[BrainPort; Welsh & Blasch, 1997; Nagel et al., 2005; Constantine-Paton & Law, 2009]
Slide by Andrew Ng
Comparison of computing power
Information circa 2012:

                    Computer                          Human Brain
Computation units   10-core Xeon: 10^9 gates          10^11 neurons
Storage units       10^9 bits RAM, 10^12 bits disk    10^11 neurons, 10^14 synapses
Cycle time          10^-9 sec                         10^-3 sec
Bandwidth           10^9 bits/sec                     10^14 bits/sec

• Computers are way faster than neurons…
• But there are a lot more neurons than we can reasonably model in modern digital computers, and they all fire in parallel
• Neural networks are designed to be massively parallel
• The brain is effectively a billion times faster
Neural Networks
• Origins: algorithms that try to mimic the brain
• Very widely used in the 80s and early 90s; popularity diminished in the late 90s
• Recent resurgence: state-of-the-art technique for many applications
• Artificial neural networks are not nearly as complex or intricate as the actual brain structure
Based on slide by Andrew Ng
Neural networks
[Figure: layered feed-forward network, with input units feeding hidden units feeding output units]

• Neural networks are made up of nodes or units, connected by links
• Each link has an associated weight and activation level
• Each node has an input function (typically summing over weighted inputs), an activation function, and an output

Based on slide by T. Finin, M. desJardins, L. Getoor, R. Parr
Neuron Model: Logistic Unit
x = [x_0, x_1, x_2, x_3]^T,   θ = [θ_0, θ_1, θ_2, θ_3]^T,   with x_0 = 1 (the "bias unit")

h_θ(x) = g(θ^T x) = 1 / (1 + e^(-θ^T x))

Sigmoid (logistic) activation function:   g(z) = 1 / (1 + e^(-z))

Based on slide by Andrew Ng
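A single logistic unit can be sketched in a few lines of Python (a minimal illustration; the weight values below are arbitrary, chosen only to show the computation):

```python
import math

def sigmoid(z):
    # g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + math.exp(-z))

def logistic_unit(theta, x):
    # h_theta(x) = g(theta^T x); by convention x[0] is the bias unit, fixed at 1
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

# Example with arbitrary illustrative weights
theta = [-1.0, 2.0, 0.5]      # theta_0 is the bias weight
x = [1.0, 1.0, 1.0]           # x_0 = 1 is the bias unit
h = logistic_unit(theta, x)   # computes g(-1 + 2 + 0.5) = g(1.5)
```

The output is always squashed into (0, 1), which is what lets a single unit act as a soft threshold.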
Neural Network
[Figure: three-layer network with bias units x_0 and a_0^(2), producing output h_Θ(x)]

Layer 1 (Input Layer) → Layer 2 (Hidden Layer) → Layer 3 (Output Layer)

Slide by Andrew Ng
Feed-Forward Process
• Input layer units are set by some exterior function (think of these as sensors), which causes their output links to be activated at the specified level
• Working forward through the network, the input function of each unit is applied to compute the input value
– Usually this is just the weighted sum of the activations on the links feeding into this node
• The activation function transforms this input function into a final value
– Typically this is a nonlinear function, often a sigmoid function corresponding to the "threshold" of that node
Based on slide by T. Finin, M. desJardins, L. Getoor, R. Parr
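The feed-forward process can be sketched as a per-unit loop (a simplified illustration with made-up weights; practical implementations vectorize this with matrix operations):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(layers, x):
    """Feed-forward pass. `layers` is a list of weight matrices, one per
    non-input layer; each matrix has one row per unit, and each row is
    [bias_weight, w_1, w_2, ...]."""
    activation = x
    for weights in layers:
        with_bias = [1.0] + activation          # prepend the bias unit
        # Input function: weighted sum; activation function: sigmoid
        activation = [sigmoid(sum(w * a for w, a in zip(row, with_bias)))
                      for row in weights]
    return activation

# Example: a single unit with bias -30 and input weights +20, +20
out = forward([[[-30.0, 20.0, 20.0]]], [1.0, 1.0])
```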
Neural Network
[Figure: network with weight matrices Θ^(1) and Θ^(2), producing output h_Θ(x)]

a_i^(j) = "activation" of unit i in layer j
Θ^(j) = weight matrix controlling function mapping from layer j to layer j + 1

If the network has s_j units in layer j and s_(j+1) units in layer j + 1, then Θ^(j) has dimension s_(j+1) × (s_j + 1).

Here, Θ^(1) ∈ R^(3×4) and Θ^(2) ∈ R^(1×4).

Slide by Andrew Ng
Vectorization
a_1^(2) = g(Θ_10^(1) x_0 + Θ_11^(1) x_1 + Θ_12^(1) x_2 + Θ_13^(1) x_3) = g(z_1^(2))

a_2^(2) = g(Θ_20^(1) x_0 + Θ_21^(1) x_1 + Θ_22^(1) x_2 + Θ_23^(1) x_3) = g(z_2^(2))

a_3^(2) = g(Θ_30^(1) x_0 + Θ_31^(1) x_1 + Θ_32^(1) x_2 + Θ_33^(1) x_3) = g(z_3^(2))

h_Θ(x) = g(Θ_10^(2) a_0^(2) + Θ_11^(2) a_1^(2) + Θ_12^(2) a_2^(2) + Θ_13^(2) a_3^(2)) = g(z_1^(3))

In vector form:   z^(2) = Θ^(1) x,   a^(2) = g(z^(2)),   z^(3) = Θ^(2) a^(2),   h_Θ(x) = g(z^(3))

[Figure: network with weight matrices Θ^(1) and Θ^(2)]
Based on slide by Andrew Ng
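The vectorized computation maps directly onto NumPy (a sketch: the weight matrices below are random stand-ins for learned parameters, using the 3×4 and 1×4 shapes from the running example):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(Theta1, Theta2, x):
    a1 = np.concatenate(([1.0], x))            # add bias unit x_0 = 1
    z2 = Theta1 @ a1                           # z^(2) = Theta^(1) a^(1)
    a2 = np.concatenate(([1.0], sigmoid(z2)))  # add bias unit a_0^(2) = 1
    z3 = Theta2 @ a2                           # z^(3) = Theta^(2) a^(2)
    return sigmoid(z3)                         # h_Theta(x)

rng = np.random.default_rng(0)
Theta1 = rng.standard_normal((3, 4))  # 3 inputs (+ bias) -> 3 hidden units
Theta2 = rng.standard_normal((1, 4))  # 3 hidden units (+ bias) -> 1 output
h = forward(Theta1, Theta2, np.array([0.5, -1.0, 2.0]))
```

With random weights the output value is meaningless; the point is the shapes and the two matrix-vector products.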
Other Network Architectures
s = [3, 3, 2, 1]

[Figure: four-layer network (Layer 1 → Layer 2 → Layer 3 → Layer 4) producing output h_Θ(x)]

• L denotes the number of layers
• s ∈ N^L contains the number of nodes in each layer
– Not counting bias units
– Typically, s_0 = d (# input features) and s_(L-1) = K (# classes)
Multiple Output Units: One-vs-Rest

[Figure: example images for the four classes: pedestrian, car, motorcycle, truck]

h_Θ(x) ∈ R^K

We want:
h_Θ(x) ≈ [1, 0, 0, 0]^T when pedestrian
h_Θ(x) ≈ [0, 1, 0, 0]^T when car
h_Θ(x) ≈ [0, 0, 1, 0]^T when motorcycle
h_Θ(x) ≈ [0, 0, 0, 1]^T when truck

Slide by Andrew Ng
Multiple Output Units: One-vs-Rest

h_Θ(x) ∈ R^K

We want:
h_Θ(x) ≈ [1, 0, 0, 0]^T when pedestrian
h_Θ(x) ≈ [0, 1, 0, 0]^T when car
h_Θ(x) ≈ [0, 0, 1, 0]^T when motorcycle
h_Θ(x) ≈ [0, 0, 0, 1]^T when truck

• Given {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}
• Must convert labels to 1-of-K representation
– e.g., y_i = [0, 0, 1, 0]^T when motorcycle, y_i = [0, 1, 0, 0]^T when car, etc.

Based on slide by Andrew Ng
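The 1-of-K conversion is a one-liner (a sketch; the class ordering below is the one implied by the slide's target vectors):

```python
def one_hot(label, classes):
    # 1-of-K representation: a single 1 at the label's index, 0 elsewhere
    return [1 if c == label else 0 for c in classes]

classes = ["pedestrian", "car", "motorcycle", "truck"]
y = one_hot("motorcycle", classes)
```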
Neural Network Classification
Given: {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}
s ∈ N^L contains the # nodes in each layer; s_0 = d (# features)

Binary classification:
– y = 0 or 1
– 1 output unit (s_(L-1) = 1)

Multi-class classification (K classes):
– y ∈ R^K, e.g. pedestrian, car, motorcycle, truck
– K output units (s_(L-1) = K)

Slide by Andrew Ng
Understanding Representations
Representing Boolean Functions
Logistic / sigmoid function g(z)

Simple example: AND, with weights (-30, +20, +20):

h_Θ(x) = g(-30 + 20x_1 + 20x_2)

x_1   x_2   h_Θ(x)
0     0     g(-30) ≈ 0
0     1     g(-10) ≈ 0
1     0     g(-10) ≈ 0
1     1     g(+10) ≈ 1

Based on slide and example by Andrew Ng
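The AND truth table can be checked directly in Python:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def and_unit(x1, x2):
    # h_Theta(x) = g(-30 + 20*x1 + 20*x2), as on the slide
    return sigmoid(-30 + 20 * x1 + 20 * x2)
```

Only when both inputs are 1 does the weighted sum (+10) clear the -30 bias, pushing the sigmoid toward 1.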
Representing Boolean Functions
AND:                      h_Θ(x) = g(-30 + 20x_1 + 20x_2)
OR:                       h_Θ(x) = g(-10 + 20x_1 + 20x_2)
NOT:                      h_Θ(x) = g(+10 - 20x_1)
(NOT x_1) AND (NOT x_2):  h_Θ(x) = g(+10 - 20x_1 - 20x_2)
Combining Representations to Create Non-Linear Functions

AND:                      g(-30 + 20x_1 + 20x_2)
(NOT x_1) AND (NOT x_2):  g(+10 - 20x_1 - 20x_2)
OR:                       g(-10 + 20x_1 + 20x_2)

not(XOR):

[Figure: the (x_1, x_2) plane divided into quadrants I–IV; not(XOR) is true in regions I and III]

Hidden unit a_1 = AND(x_1, x_2):            g(-30 + 20x_1 + 20x_2)   (fires in region I)
Hidden unit a_2 = (NOT x_1) AND (NOT x_2):  g(+10 - 20x_1 - 20x_2)   (fires in region III)
Output = a_1 OR a_2 ("I or III"):           g(-10 + 20a_1 + 20a_2)

Based on example by Andrew Ng
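The two-layer not(XOR) network can be sketched by composing the three units above:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def xnor(x1, x2):
    a1 = sigmoid(-30 + 20 * x1 + 20 * x2)    # AND: fires in region I
    a2 = sigmoid(10 - 20 * x1 - 20 * x2)     # (NOT x1) AND (NOT x2): region III
    return sigmoid(-10 + 20 * a1 + 20 * a2)  # OR of the two hidden units
```

No single logistic unit can represent XOR or XNOR (they are not linearly separable); the hidden layer is what makes this non-linear decision boundary possible.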
Layering Representations
[Figure: a 20 × 20 pixel image read off row by row as x_1 ... x_20, x_21 ... x_40, ..., x_381 ... x_400]

20 × 20 pixel images → d = 400
10 classes

Each image is "unrolled" into a vector x of pixel intensities
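Unrolling a 20 × 20 image into a 400-dimensional feature vector is a single reshape (NumPy sketch with dummy pixel data):

```python
import numpy as np

# A dummy 20 x 20 grayscale image (zeros stand in for pixel intensities)
image = np.zeros((20, 20))

# "Unroll" it into a d = 400 feature vector
x = image.reshape(-1)
```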
Layering Representations
[Figure: input layer (x_1, x_2, ..., x_d) → hidden layer → output layer with one unit per digit class ("0", "1", ..., "9"); visualization of the hidden layer]
Neural Network Learning