CS224W: Analysis of Networks
Jure Leskovec, Stanford University
¡ (1) Power-laws in Networks
¡ (2) Network Robustness
¡ (3) Preferential Attachment
11/8/18
Jure Leskovec, Stanford CS224W: Analysis of Networks,
Which interesting graph
properties do we observe
that need explaining?
¡ Small-world model:
§ Diameter
§ Clustering coefficient
¡ Node degree distribution
§ What fraction of nodes has degree $k$ (as a function of $k$)?
§ Prediction from simple random graph models: $P(k)$ = exponential function of $k$
§ Observation: Often a power-law: $P(k) \propto k^{-\alpha}$
[Figure: degree distribution expected based on $G_{np}$ vs. found in data, where $P(k) \propto k^{-\alpha}$]
[Leskovec et al. KDD ‘08]
¡ Take a network, plot a histogram of $P(k)$ vs. $k$
Probability: $P(k) = P(X = k)$
Plot: fraction of nodes with degree $k$: $P(k) = \frac{|\{u : d_u = k\}|}{N}$
[Figure: Flickr social network, n = 584,207, m = 3,555,115]
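As a minimal sketch of the computation described above, the fraction of nodes with each degree can be tallied from an edge list. The function name `degree_distribution` and the toy edge list are illustrative assumptions, not part of the lecture; the sketch assumes an undirected graph in which every node appears in at least one edge.

```python
from collections import Counter

def degree_distribution(edges):
    """Return {k: fraction of nodes with degree k} for an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)  # only nodes that appear in at least one edge are counted
    counts = Counter(degree.values())
    return {k: c / n for k, c in sorted(counts.items())}

# Toy graph (hypothetical data, not the Flickr network).
edges = [(0, 1), (0, 2), (0, 3), (3, 4)]
p = degree_distribution(edges)
for k, frac in p.items():
    print(k, frac)
```

Plotting these pairs on linear axes gives the histogram; taking logarithms of both coordinates gives the log-log view.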
[Leskovec et al. KDD ‘08]
¡ Plot the same data on log-log scale:
Probability: $P(k) = P(X = k)$
$P(k) \propto k^{-1.75}$
Slope = $-\alpha = -1.75$
[Figure: Flickr social network, n = 584,207, m = 3,555,115]
How to distinguish $P(k) \propto \exp(-k)$ vs. $P(k) \propto k^{-\alpha}$?
Take logarithms:
If $y = f(x) = e^{-x}$ then $\log y = -x$
If $y = x^{-\alpha}$ then $\log y = -\alpha \log(x)$
So on log-log axis power-law looks like a straight line of slope $-\alpha$!
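The straight-line claim can be checked numerically. The sketch below, with an illustrative exponent of 1.75 and hypothetical function names, measures the log-log slope between pairs of points: it stays constant for the power law and keeps getting steeper for the exponential.

```python
import math

ALPHA = 1.75  # illustrative exponent, chosen for this sketch

def power_law(x):
    return x ** -ALPHA

def exponential(x):
    return math.exp(-x)

def loglog_slope(f, x1, x2):
    """Slope of log f(x) against log x between the points x1 and x2."""
    return (math.log(f(x2)) - math.log(f(x1))) / (math.log(x2) - math.log(x1))

starts = (1, 2, 4, 8)
pl_slopes = [loglog_slope(power_law, x, 2 * x) for x in starts]
exp_slopes = [loglog_slope(exponential, x, 2 * x) for x in starts]

print(pl_slopes)   # the same value, -ALPHA, on every interval
print(exp_slopes)  # increasingly negative: not a straight line on log-log axes
```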
¡ First observed in Internet Autonomous Systems [Faloutsos, Faloutsos and Faloutsos, 1999]
[Figure: Internet domain topology]
¡ The World Wide Web [Broder et al., 2000]
¡ Other Networks [Barabasi-Albert, 1999]
[Figure panels: Actor collaborations, Web graph, Power-grid]
[Figure: $p(x)$ vs. $x$ on linear axes for $p(x) = c x^{-0.5}$, $p(x) = c x^{-1}$, and $p(x) = c^{-x}$]
¡ Above a certain $x$ value, the power law is always higher than the exponential!
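To make the crossover concrete, the sketch below scans integer values of $x$ and reports the last one at which the exponential is still at least as large as the power law. The forms $x^{-\alpha}$ and $e^{-\lambda x}$ and the values $\alpha = 2$, $\lambda = 0.1$ are assumptions chosen for illustration, not the curves from the slide.

```python
import math

ALPHA = 2.0  # assumed power-law exponent (illustrative)
LAM = 0.1    # assumed exponential decay rate (illustrative)

def power_law(x):
    return x ** -ALPHA

def exponential(x):
    return math.exp(-LAM * x)

def last_x_where_exponential_wins(x_max):
    """Largest integer x in [1, x_max] with power_law(x) <= exponential(x), or None."""
    last = None
    for x in range(1, x_max + 1):
        if power_law(x) <= exponential(x):
            last = x
    return last

crossover = last_x_where_exponential_wins(10_000)
print(crossover)
```

Beyond the reported value the power law stays above the exponential for the rest of the scanned range, because $\lambda x$ grows faster than $\alpha \log x$.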
[Clauset-Shalizi-Newman 2007]
Power-law vs. Exponential
on log-log and semi-log (log-lin) scales
[Figure, log-log panel (x: logarithmic axis, y: logarithmic axis): $p(x) = c x^{-0.5}$, $p(x) = c x^{-1}$, $p(x) = c^{-x}$]
[Figure, semi-log panel (x: linear axis, y: logarithmic axis): $p(x) = c x^{-0.5}$, $p(x) = c x^{-1}$, $p(x) = c^{-x}$]
¡ Power-law degree exponent is typically $2 < \alpha < 3$
§ Web graph:
§ $\alpha_{in} = 2.1$, $\alpha_{out} = 2.4$ [Broder et al. 00]
§ Autonomous systems:
§ $\alpha = 2.4$ [Faloutsos$^3$, 99]
§ Actor-collaborations:
§ $\alpha = 2.3$ [Barabasi-Albert 00]
§ Citations to papers:
§ $\alpha \approx 3$ [Redner 98]
§ Online social networks:
§ $\alpha \approx 2$ [Leskovec et al. 07]
¡ Definition: Networks with a power-law tail in their degree distribution are called “scale-free networks”
¡
Where does the name scale-free come from?
§ Scale invariance: There is no characteristic scale
§ Scale invariance means laws do not change if scales of length,
energy, or other variables, are multiplied by a common factor
§ Scale-free function: $f(ax) = a^{\lambda} f(x)$
§ Power-law function: $f(ax) = a^{\lambda} x^{\lambda} = a^{\lambda} f(x)$
Log() or Exp() are not scale free!
$f(ax) = \log(ax) = \log(a) + \log(x) = \log(a) + f(x)$
$f(ax) = \exp(ax) = \exp(x)^{a} = f(x)^{a}$
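The scale-free property says that $f(ax)/f(x)$ depends only on $a$, never on $x$. A minimal numerical check of that statement follows; the helper name, the test points, and the exponent $-2.5$ are assumptions made for this sketch.

```python
import math

def ratio_depends_only_on_a(f, xs, a, rel_tol=1e-9):
    """True if f(a*x)/f(x) is (numerically) the same for every x in xs."""
    ratios = [f(a * x) / f(x) for x in xs]
    return max(ratios) - min(ratios) <= rel_tol * max(abs(r) for r in ratios)

def power_law(x):
    return x ** -2.5  # illustrative exponent

xs = [2.0, 5.0, 10.0, 50.0]  # x = 1 is avoided because log(1) = 0

power_ok = ratio_depends_only_on_a(power_law, xs, a=3.0)
log_ok = ratio_depends_only_on_a(math.log, xs, a=3.0)
exp_ok = ratio_depends_only_on_a(math.exp, xs, a=3.0)
print(power_ok, log_ok, exp_ok)
```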
[Clauset-Shalizi-Newman 2007]
Many other quantities follow heavy-tailed distributions
CMU grad-students at
the G20 meeting in
Pittsburgh in Sept 2009
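Degrees are heavily skewed:
Distribution $P(X > x)$ is heavy tailed if: $\lim_{x \to \infty} \frac{P(X > x)}{e^{-\lambda x}} = \infty$
¡ Note:
§ Normal PDF: $p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
§ Exponential PDF: $p(x) = \lambda e^{-\lambda x}$
§ then $P(X > x) = 1 - P(X \le x) = e^{-\lambda x}$
are not heavy tailed!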
[Clauset-Shalizi-Newman 2007]
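¡ What is the normalizing constant?
$p(x) = Z x^{-\alpha}$, $Z = ?$
§ $p(x)$ is a distribution: $\int p(x)\,dx = 1$
Continuous approximation
§ $1 = \int_{x_m}^{\infty} p(x)\,dx = Z \int_{x_m}^{\infty} x^{-\alpha}\,dx$
§ $= -\frac{Z}{\alpha - 1} \left[ x^{-\alpha + 1} \right]_{x_m}^{\infty} = -\frac{Z}{\alpha - 1} \left[ \infty^{1-\alpha} - x_m^{1-\alpha} \right]$
§ $\Rightarrow Z = (\alpha - 1)\, x_m^{\alpha - 1}$
$p(x) = \frac{\alpha - 1}{x_m} \left( \frac{x}{x_m} \right)^{-\alpha}$
$p(x)$ diverges as $x \to 0$, so $x_m$ is the minimum value of the power-law distribution, $x \in [x_m, \infty)$
Need: $\alpha > 1$!
Integral: $\int (ax)^n\,dx = \frac{(ax)^{n+1}}{a(n+1)}$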
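As a sanity check on the normalizing constant, the sketch below integrates the density numerically. It uses the substitution $x = x_m e^{t}$ and a midpoint rule; the parameter values ($\alpha = 2.5$, $x_m = 3$), the cutoff `t_max`, and the function names are assumptions for illustration.

```python
import math

def power_law_pdf(x, alpha, x_min):
    """p(x) = (alpha - 1) / x_min * (x / x_min)^(-alpha) for x >= x_min."""
    return (alpha - 1) / x_min * (x / x_min) ** (-alpha)

def integrate_pdf(alpha, x_min, t_max=40.0, steps=100_000):
    """Midpoint rule over t in [0, t_max], where x = x_min * e^t and dx = x dt."""
    h = t_max / steps
    total = 0.0
    for i in range(steps):
        x = x_min * math.exp((i + 0.5) * h)
        total += power_law_pdf(x, alpha, x_min) * x * h
    return total

total_mass = integrate_pdf(alpha=2.5, x_min=3.0)
print(total_mass)  # close to 1 when alpha > 1
```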
[Clauset-Shalizi-Newman 2007]
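¡ What’s the expected value of a power-law random variable $X$?
¡ $E[X] = \int_{x_m}^{\infty} x\, p(x)\,dx = Z \int_{x_m}^{\infty} x^{-\alpha + 1}\,dx$
¡ $= \frac{Z}{2 - \alpha} \left[ x^{2-\alpha} \right]_{x_m}^{\infty} = \frac{(\alpha - 1)\, x_m^{\alpha - 1}}{-(\alpha - 2)} \left[ \infty^{2-\alpha} - x_m^{2-\alpha} \right]$
$\Rightarrow E[X] = \frac{\alpha - 1}{\alpha - 2}\, x_m$
Need: $\alpha > 2$!
Power-law density: $p(x) = \frac{\alpha - 1}{x_m} \left( \frac{x}{x_m} \right)^{-\alpha}$, $Z = (\alpha - 1)\, x_m^{\alpha - 1}$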
¡ Power-laws have infinite moments!
$E[X] = \frac{\alpha - 1}{\alpha - 2}\, x_m$
§ If $\alpha \le 2$: $E[X] = \infty$
§ If $\alpha \le 3$: $Var[X] = \infty$
In real networks $2 < \alpha < 3$ so: $E[X] = const$, $Var[X] = \infty$
§ Average is meaningless, as the variance is too high!
¡ Consequence: Sample average of n samples from a power-law with exponent α
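The behavior of sample averages can be explored by simulation. The sketch below draws samples by inverse-transform sampling, using $P(X > x) = (x/x_m)^{-(\alpha - 1)}$, and compares sample means against $E[X] = \frac{\alpha - 1}{\alpha - 2} x_m$. The function names, seed, sample sizes, and the two exponents are assumptions made for illustration.

```python
import random

def sample_power_law(n, alpha, x_min, rng):
    """Inverse-transform sampling from P(X > x) = (x / x_min)^-(alpha - 1)."""
    exponent = -1.0 / (alpha - 1.0)
    # 1 - rng.random() lies in (0, 1], which avoids raising 0 to a negative power.
    return [x_min * (1.0 - rng.random()) ** exponent for _ in range(n)]

def analytic_mean(alpha, x_min):
    """E[X] = (alpha - 1) / (alpha - 2) * x_min, infinite when alpha <= 2."""
    if alpha <= 2:
        return float("inf")
    return (alpha - 1.0) / (alpha - 2.0) * x_min

rng = random.Random(0)

# alpha = 3.5: mean and variance are both finite, so the sample mean settles.
light = sample_power_law(200_000, 3.5, 1.0, rng)
light_mean = sum(light) / len(light)

# alpha = 2.5: finite mean but infinite variance, so repeated runs fluctuate more.
heavy_means = []
for _ in range(5):
    heavy = sample_power_law(100_000, 2.5, 1.0, rng)
    heavy_means.append(sum(heavy) / len(heavy))

print(light_mean, analytic_mean(3.5, 1.0))
print(heavy_means, analytic_mean(2.5, 1.0))
```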
Estimating $\alpha$ from data:
¡ (1) Fit a line on log-log axis using least squares: BAD!
§ Solve $\arg\min_{\alpha} \left( \log(y) - (-\alpha \log(x) + b) \right)^2$
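Estimating $\alpha$ from data:
¡ Plot Complementary CDF (CCDF) $P(X \ge x)$: OK!
Then the estimated $\alpha = 1 + \alpha'$, where $-\alpha'$ is the slope of $P(X \ge x)$ on log-log axes.
¡ Fact: If $p(x) = P(X = x) \propto x^{-\alpha}$ then $P(X \ge x) \propto x^{-(\alpha - 1)}$
§ $P(X \ge x) = \sum_{j=x}^{\infty} p(j) \approx \int_{x}^{\infty} Z y^{-\alpha}\,dy$
§ $= \frac{Z}{1 - \alpha} \left[ y^{1-\alpha} \right]_{x}^{\infty} = \frac{Z}{\alpha - 1}\, x^{-(\alpha - 1)}$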
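A minimal sketch of the CCDF approach: build the empirical CCDF from observations, fit a line to it on log-log axes, and recover $\alpha$ as one minus the slope. The function names and the small data sets are hypothetical. The fit is demonstrated on exact CCDF values so the result is not affected by sampling noise.

```python
import math

def empirical_ccdf(samples):
    """Return [(x, P(X >= x))] for each distinct value x in samples."""
    n = len(samples)
    ordered = sorted(samples)
    points = []
    for i, x in enumerate(ordered):
        if i == 0 or x != ordered[i - 1]:
            points.append((x, (n - i) / n))
    return points

def loglog_slope(points):
    """Ordinary least-squares slope of log y against log x."""
    lx = [math.log(x) for x, _ in points]
    ly = [math.log(y) for _, y in points]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

def alpha_from_ccdf(points):
    """alpha = 1 + alpha', where -alpha' is the log-log slope of the CCDF."""
    return 1.0 - loglog_slope(points)

ccdf_small = empirical_ccdf([1, 2, 2, 5])

# Exact CCDF of a power law with alpha = 2.5 and x_m = 1 (illustrative values).
exact = [(x, x ** -(2.5 - 1.0)) for x in (1, 2, 4, 8, 16, 32)]
alpha_hat = alpha_from_ccdf(exact)
print(ccdf_small, alpha_hat)
```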
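Estimating $\alpha$ from data:
¡ Use maximum likelihood approach: OK!
§ The log-likelihood of observed data $d_i$:
§ $L(\alpha) = \ln \prod_{i}^{n} p(d_i) = \sum_{i}^{n} \ln p(d_i)$
§ $= \sum_{i}^{n} \left[ \ln(\alpha - 1) - \ln x_m - \alpha \ln \frac{d_i}{x_m} \right]$
§ Want to find $\alpha$ that maximizes $L(\alpha)$: Set $\frac{dL(\alpha)}{d\alpha} = 0$
§ $\frac{dL(\alpha)}{d\alpha} = 0 \Rightarrow \frac{n}{\alpha - 1} - \sum_{i}^{n} \ln \frac{d_i}{x_m} = 0$
§ $\Rightarrow \hat{\alpha} = 1 + n \left[ \sum_{i}^{n} \ln \frac{d_i}{x_m} \right]^{-1}$
Power-law density: $p(x) = \frac{\alpha - 1}{x_m} \left( \frac{x}{x_m} \right)^{-\alpha}$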
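The closed-form estimator $\hat{\alpha} = 1 + n \left[ \sum_i \ln \frac{d_i}{x_m} \right]^{-1}$ is short enough to sketch directly. The function names, the seed, and the synthetic data below are assumptions for illustration; the data are drawn from the continuous density, so applying the same formula to integer degrees is only an approximation.

```python
import math
import random

def power_law_mle(data, x_min):
    """alpha_hat = 1 + n / sum(ln(d_i / x_min)), using only values d_i >= x_min.

    Raises ZeroDivisionError if every retained value equals x_min.
    """
    tail = [d for d in data if d >= x_min]
    log_sum = sum(math.log(d / x_min) for d in tail)
    return 1.0 + len(tail) / log_sum

def sample_power_law(n, alpha, x_min, rng):
    """Inverse-transform sampling from P(X > x) = (x / x_min)^-(alpha - 1)."""
    exponent = -1.0 / (alpha - 1.0)
    return [x_min * (1.0 - rng.random()) ** exponent for _ in range(n)]

rng = random.Random(0)
data = sample_power_law(50_000, 2.5, 1.0, rng)  # synthetic data, true alpha = 2.5
alpha_hat = power_law_mle(data, x_min=1.0)
print(alpha_hat)
```

Unlike the least-squares fit on a log-log histogram, this estimator needs no binning, only the choice of $x_m$.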