14 network robustness and preferential attachment


CS224W: Analysis of Networks
Jure Leskovec, Stanford University

¡ (1) Power-laws in Networks
¡ (2) Network Robustness
¡ (3) Preferential Attachment

11/8/18

Jure Leskovec, Stanford CS224W: Analysis of Networks,

2

Which interesting graph
properties do we observe
that need explaining?
¡ Small-world model:
§ Diameter
§ Clustering coefficient

¡ Node degree distribution
§ What fraction of nodes has degree $k$ (as a function of $k$)?
§ Prediction from simple random graph models: $p(k)$ = exponential function of $k$
§ Observation: Often a power-law: $P(k) \propto k^{-\alpha}$


Expected based on $G_{np}$
Found in data: $P(k) \propto k^{-\alpha}$

[Leskovec et al. KDD ‘08]

¡ Take a network, plot a histogram of $P(k)$ vs. $k$
§ Probability: $P(k) = P(X = k)$
§ Plot: fraction of nodes with degree $k$: $P(k) = \frac{|\{u \mid d_u = k\}|}{N}$
[Figure: degree histogram of the Flickr social network, n = 584,207, m = 3,555,115]
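The "fraction of nodes with degree k" computation can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the Flickr plot: the function name `degree_distribution` and the toy edge list are assumptions made for the example, and nodes with no edges are not counted because they never appear in an edge list.

```python
from collections import Counter

def degree_distribution(edges):
    """Return {k: fraction of nodes with degree k} for an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)                     # nodes that appear in at least one edge
    counts = Counter(degree.values())   # counts[k] = number of nodes with degree k
    return {k: c / n for k, c in sorted(counts.items())}

# Hypothetical toy graph (not the Flickr network).
toy_edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
dist = degree_distribution(toy_edges)
print(dist)
```

Plotting `dist` on linear axes gives the histogram view; taking the logarithm of both the keys and the values gives the log-log view.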


[Leskovec et al. KDD ‘08]

¡ Plot the same data on log-log scale:
§ Probability: $P(k) = P(X = k)$
§ $P(k) \propto k^{-1.75}$
§ Slope $= -\alpha = -1.75$
[Figure: the same Flickr degree distribution on log-log axes, n = 584,207, m = 3,555,115]


How to distinguish $P(k) \propto \exp(-k)$ vs. $P(k) \propto k^{-\alpha}$?
Take logarithms:
§ If $y = f(x) = e^{-x}$ then $\log y = -x$
§ If $y = x^{-\alpha}$ then $\log y = -\alpha \log(x)$
So on log-log axes a power-law looks like a straight line of slope $-\alpha$!
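The logarithm argument can be checked numerically. The sketch below uses exact, noise-free values rather than real network data, and the helper `slope` is a plain least-squares slope written for this example. It recovers slope −α for the power law on log-log axes and slope −1 for the exponential on semi-log axes.

```python
import math

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

alpha = 1.75                       # exponent quoted for the Flickr plot
ks = [1, 2, 5, 10, 20, 50, 100]    # illustrative degree values
log_k = [math.log(k) for k in ks]

log_power = [math.log(k ** -alpha) for k in ks]   # log of k^(-alpha)
log_expo = [math.log(math.exp(-k)) for k in ks]   # log of exp(-k)

# Power law: straight line against log k, slope -alpha.
power_loglog_slope = slope(log_k, log_power)
# Exponential: straight line against k itself (semi-log axes), slope -1.
expo_semilog_slope = slope(ks, log_expo)
print(power_loglog_slope, expo_semilog_slope)
```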

¡ First observed in Internet Autonomous Systems
[Faloutsos, Faloutsos and Faloutsos, 1999]

Internet domain topology


¡ The World Wide Web [Broder et al., 2000]


¡ Other Networks [Barabasi-Albert, 1999]
[Figure panels: Actor collaborations, Web graph, Power-grid]

[Figure: $p(x)$ vs. $x$ on linear axes, $x$ from 0 to 100, comparing $p(x) = c x^{-0.5}$, $p(x) = c x^{-1}$, and $p(x) = c^{-x}$]
¡ Above a certain $x$ value, the power law is always higher than the exponential!

[Clauset-Shalizi-Newman 2007]

Power-law vs. Exponential on log-log and semi-log (log-lin) scales
[Figure, two panels, each comparing $p(x) = c x^{-0.5}$, $p(x) = c x^{-1}$, and $p(x) = c^{-x}$]
§ log-log: x … logarithmic axis, y … logarithmic axis
§ semi-log: x … linear axis, y … logarithmic axis


¡ Power-law degree exponent is typically $2 < \alpha < 3$
§ Web graph: $\alpha_{in} = 2.1$, $\alpha_{out} = 2.4$ [Broder et al. 00]
§ Autonomous systems: $\alpha = 2.4$ [Faloutsos, Faloutsos and Faloutsos 99]
§ Actor-collaborations: $\alpha = 2.3$ [Barabasi-Albert 00]
§ Citations to papers: $\alpha \approx 3$ [Redner 98]
§ Online social networks: $\alpha \approx 2$ [Leskovec et al. 07]

¡ Definition: Networks with a power-law tail in their degree distribution are called “scale-free networks”
¡ Where does the name scale-free come from?
§ Scale invariance: There is no characteristic scale
§ Scale invariance means laws do not change if scales of length, energy, or other variables, are multiplied by a common factor
§ Scale-free function: $f(ax) = a^{\lambda} f(x)$
§ Power-law function: $f(ax) = a^{\lambda} x^{\lambda} = a^{\lambda} f(x)$
¡ Log() or Exp() are not scale free!
§ $f(ax) = \log(ax) = \log(a) + \log(x) = \log(a) + f(x)$
§ $f(ax) = \exp(ax) = (\exp(x))^{a} = f(x)^{a}$
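A quick numerical check of the scale-free property, as a sketch only. The helper `is_scale_free` and the test values of a, λ, and x are assumptions chosen for illustration. A failed check for one choice of a and λ is not a proof that no λ works; the identities for log and exp on the slide give the actual argument.

```python
import math

def is_scale_free(f, a, lam, xs, tol=1e-9):
    """Check f(a*x) == a**lam * f(x) at every test point x."""
    return all(math.isclose(f(a * x), a ** lam * f(x), rel_tol=tol) for x in xs)

xs = [0.5, 1.0, 2.0, 7.0]   # arbitrary positive test points
a, lam = 3.0, -2.5          # arbitrary scale factor and exponent

power_ok = is_scale_free(lambda x: x ** lam, a, lam, xs)  # power law f(x) = x^lam
log_ok = is_scale_free(math.log, a, lam, xs)              # f(x) = log(x)
exp_ok = is_scale_free(math.exp, a, lam, xs)              # f(x) = exp(x)
print(power_ok, log_ok, exp_ok)
```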

[Clauset-Shalizi-Newman 2007]

Many other quantities follow heavy-tailed distributions

CMU grad-students at
the G20 meeting in
Pittsburgh in Sept 2009

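¡ Degrees are heavily skewed: Distribution $P(X > x)$ is heavy tailed if:
$\lim_{x \to \infty} \frac{P(X > x)}{e^{-\lambda x}} = \infty$
¡ Note:
§ Normal PDF: $p(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{(x - \mu)^2}{2 \sigma^2}}$
§ Exponential PDF: $p(x) = \lambda e^{-\lambda x}$, then $P(X > x) = 1 - P(X \le x) = e^{-\lambda x}$
are not heavy tailed!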

[Clauset-Shalizi-Newman 2007]

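¡ What is the normalizing constant? $p(x) = Z x^{-\alpha}$, $Z = ?$
§ $p(x)$ is a distribution: $\int p(x)\,dx = 1$ (continuous approximation)
§ $1 = \int_{x_m}^{\infty} p(x)\,dx = Z \int_{x_m}^{\infty} x^{-\alpha}\,dx$
§ $= -\frac{Z}{\alpha - 1} \left[ x^{-\alpha + 1} \right]_{x_m}^{\infty} = -\frac{Z}{\alpha - 1} \left[ \infty^{1 - \alpha} - x_m^{1 - \alpha} \right]$
§ $\Rightarrow Z = (\alpha - 1)\, x_m^{\alpha - 1}$
$p(x) = \frac{\alpha - 1}{x_m} \left( \frac{x}{x_m} \right)^{-\alpha}$
¡ $p(x)$ diverges as $x \to 0$, so $x_m$ is the minimum value of the power-law distribution: $x \in [x_m, \infty]$
¡ Need: $\alpha > 1$!
¡ Integral: $\int (ax)^n\,dx = \frac{(ax)^{n+1}}{a(n+1)}$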
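As a sanity check on the normalizing constant, the density can be integrated numerically. This is a sketch under stated assumptions: α = 2.5 and x_m = 1 are illustrative values, the upper limit 10^6 stands in for infinity, and the trapezoid rule on a log-spaced grid is just one convenient choice of quadrature.

```python
def power_law_pdf(x, alpha, x_min):
    """p(x) = (alpha - 1) / x_min * (x / x_min) ** (-alpha) for x >= x_min."""
    if x < x_min:
        return 0.0
    return (alpha - 1) / x_min * (x / x_min) ** (-alpha)

def integrate_pdf(alpha, x_min, x_max, steps=100_000):
    """Trapezoid rule on a log-spaced grid from x_min to x_max."""
    ratio = (x_max / x_min) ** (1 / steps)   # constant multiplicative step
    total, x = 0.0, x_min
    for _ in range(steps):
        nxt = x * ratio
        left = power_law_pdf(x, alpha, x_min)
        right = power_law_pdf(nxt, alpha, x_min)
        total += 0.5 * (left + right) * (nxt - x)
        x = nxt
    return total

alpha, x_min = 2.5, 1.0   # illustrative values, alpha > 1 as required
mass = integrate_pdf(alpha, x_min, x_max=1e6)
print(mass)               # should be very close to 1
```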

[Clauset-Shalizi-Newman 2007]

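¡ What’s the expected value of a power-law random variable $X$?
¡ $E[X] = \int_{x_m}^{\infty} x\, p(x)\,dx = Z \int_{x_m}^{\infty} x^{-\alpha + 1}\,dx$
¡ $= \frac{Z}{2 - \alpha} \left[ x^{2 - \alpha} \right]_{x_m}^{\infty} = \frac{(\alpha - 1)\, x_m^{\alpha - 1}}{-(\alpha - 2)} \left[ \infty^{2 - \alpha} - x_m^{2 - \alpha} \right]$
$\Rightarrow E[X] = \frac{\alpha - 1}{\alpha - 2}\, x_m$
¡ Need: $\alpha > 2$!
Power-law density: $p(x) = \frac{\alpha - 1}{x_m} \left( \frac{x}{x_m} \right)^{-\alpha}$, $Z = \frac{\alpha - 1}{x_m^{1 - \alpha}}$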

¡ Power-laws have infinite moments!
$E[X] = \frac{\alpha - 1}{\alpha - 2}\, x_m$
§ If $\alpha \le 2$: $E[X] = \infty$
§ If $\alpha \le 3$: $Var[X] = \infty$
§ Average is meaningless, as the variance is too high!
In real networks $2 < \alpha < 3$ so: $E[X] = const$, $Var[X] = \infty$
¡ Consequence: Sample average of n samples from a power-law with exponent α
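The effect of infinite variance on sample averages can be explored with a small simulation. This is a sketch, not the experiment behind the slide: it draws samples by inverting the CCDF $P(X > x) = (x / x_m)^{-(\alpha - 1)}$, and the sample size, seeds, and the two exponents (2.5 with infinite variance, 4.0 with finite variance) are illustrative assumptions.

```python
import random

def sample_power_law(alpha, x_min, rng):
    """Inverse-transform sample from p(x) proportional to x^(-alpha), x >= x_min."""
    u = rng.random()   # uniform on [0, 1), so 1 - u is in (0, 1]
    return x_min * (1.0 - u) ** (-1.0 / (alpha - 1.0))

def sample_mean(alpha, x_min, n, seed):
    """Average of n independent power-law samples for a given seed."""
    rng = random.Random(seed)
    return sum(sample_power_law(alpha, x_min, rng) for _ in range(n)) / n

x_min = 1.0
for alpha in (2.5, 4.0):
    expected = (alpha - 1) / (alpha - 2) * x_min   # E[X] from the formula above
    means = [sample_mean(alpha, x_min, n=20_000, seed=s) for s in range(5)]
    print(alpha, expected, [round(m, 3) for m in means])
```

Comparing the spread of the five sample means for each exponent against the theoretical $E[X]$ shows how much less stable the average is when the variance is infinite.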

Estimating $\alpha$ from data:
¡ (1) Fit a line on log-log axis using least squares: BAD!
§ Solve $\arg\min_{\alpha} \left( \log(y) - (-\alpha \log(x) + b) \right)^2$

Estimating $\alpha$ from data:
¡ Plot Complementary CDF (CCDF) $P(X \ge x)$: OK!
§ Then the estimated $\alpha = 1 + \alpha'$, where $-\alpha'$ is the slope of $P(X \ge x)$ on log-log axes
¡ Fact: If $p(x) = P(X = x) \propto x^{-\alpha}$ then $P(X \ge x) \propto x^{-(\alpha - 1)}$
§ $P(X \ge x) = \sum_{j=x}^{\infty} p(j) \approx \int_{x}^{\infty} Z y^{-\alpha}\,dy$
§ $= \frac{Z}{1 - \alpha} \left[ y^{1 - \alpha} \right]_{x}^{\infty} = \frac{Z}{\alpha - 1}\, x^{-(\alpha - 1)}$
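The CCDF approach can be sketched as follows. The data here are synthetic samples drawn with a known exponent (an assumption for illustration, not real degree data), and the helpers `empirical_ccdf` and `loglog_slope` are written for this example. The slope is fitted by plain least squares over all CCDF points, which is a simplification.

```python
import math
import random

def empirical_ccdf(values):
    """Return (x, fraction of samples >= x) for each distinct x, in sorted order."""
    xs = sorted(values)
    n = len(xs)
    points = []
    for i, x in enumerate(xs):
        if i == 0 or x != xs[i - 1]:        # first occurrence of this value
            points.append((x, (n - i) / n))  # n - i samples are >= x
    return points

def loglog_slope(points):
    """Least-squares slope of log(y) against log(x)."""
    lx = [math.log(x) for x, _ in points]
    ly = [math.log(p) for _, p in points]
    n = len(points)
    mx, my = sum(lx) / n, sum(ly) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx

# Synthetic data with a known exponent (illustrative assumption).
rng = random.Random(0)
true_alpha, x_min = 2.5, 1.0
data = [x_min * (1.0 - rng.random()) ** (-1.0 / (true_alpha - 1.0))
        for _ in range(50_000)]

ccdf_slope = loglog_slope(empirical_ccdf(data))
alpha_hat = 1.0 - ccdf_slope   # CCDF slope is -(alpha - 1)
print(alpha_hat)
```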

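Estimating $\alpha$ from data:
¡ Use maximum likelihood approach: OK!
§ The log-likelihood of observed data $d_i$:
§ $L(\alpha) = \ln \prod_{i}^{n} p(d_i) = \sum_{i}^{n} \ln p(d_i)$
§ $= \sum_{i}^{n} \left[ \ln(\alpha - 1) - \ln(x_m) - \alpha \ln\left( \frac{d_i}{x_m} \right) \right]$
§ Want to find $\alpha$ that maximizes $L(\alpha)$: Set $\frac{dL(\alpha)}{d\alpha} = 0$
§ $\frac{dL(\alpha)}{d\alpha} = 0 \Rightarrow \frac{n}{\alpha - 1} - \sum_{i}^{n} \ln\left( \frac{d_i}{x_m} \right) = 0$
§ $\Rightarrow \hat{\alpha} = 1 + n \left[ \sum_{i}^{n} \ln\left( \frac{d_i}{x_m} \right) \right]^{-1}$
Power-law density: $p(x) = \frac{\alpha - 1}{x_m} \left( \frac{x}{x_m} \right)^{-\alpha}$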
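The closed-form estimator translates directly into code. The sketch below assumes the continuous approximation used in the derivation and tests it on synthetic samples with a known exponent. The function name `mle_alpha` and the test parameters are illustrative, and for integer degrees the result is only approximate.

```python
import math
import random

def mle_alpha(degrees, x_min):
    """alpha_hat = 1 + n / sum(ln(d_i / x_min)), using only values d_i >= x_min."""
    tail = [d for d in degrees if d >= x_min]
    log_sum = sum(math.log(d / x_min) for d in tail)
    if log_sum == 0:
        raise ValueError("need at least one value strictly above x_min")
    return 1.0 + len(tail) / log_sum

# Synthetic continuous samples with a known exponent (illustrative assumption).
rng = random.Random(42)
true_alpha, x_min = 2.5, 1.0
data = [x_min * (1.0 - rng.random()) ** (-1.0 / (true_alpha - 1.0))
        for _ in range(50_000)]

alpha_hat = mle_alpha(data, x_min)
print(alpha_hat)   # should land close to true_alpha
```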
