
Email Network Analysis for Organizational Management


Abstract - In the turbulent business environment of the
global recession, traditional organizational structures are
reaching their limits. To adapt to these changes, an
organization must manage informal communication that
extends beyond its old framework. For innovation
management, it is critical to recognize communities of
practice and informal leaders. In previous studies, through
a case study of a global manufacturing company, we
demonstrated that our method was effective in identifying
informal communities and potential leaders from one month
of email log data collected within the organization in
September 2008. In this paper we collect a second set of
one-month email log data, from June 2009, to compare it
chronologically with the first set and to analyze changes
before and after the major organizational changes triggered
by the bankruptcy of Lehman Brothers. Email network
analysis helps management view its organization
systematically as a whole.


Keywords - email, network analysis, organizational
management, leadership, innovation


I. INTRODUCTION

In this turbulent business environment, traditional
organizational structures are reaching their limits. To
adapt to these changes, and to survive and prosper,
business organizations need to manage communication
networks that extend beyond the old framework. For
innovation management, it is indispensable to identify
communities of practice and to deploy informal leaders.
Through a case study of a global manufacturing
company, our previous studies demonstrated that our
method was effective in identifying informal
communities and potential leaders by network analysis
of the first set of one-month email log data, collected
within the organization in September 2008 [1].
The previous case study, together with interviews,
showed that the identified communities and hierarchical
structures reflect the actual organizational structure.
Most of the people with high network centralities were
recognized as key persons in the firm. We found that both
betweenness and PageRank are good indicators for
detecting hidden leadership in their communities.
In this paper, we collect a second set of one-month
email log data, from June 2009, and compare it
chronologically with the first set to analyze any changes.
We use the same methodology as in the previous studies:
we construct an email network from a set of log data, and
then identify communities in the email network by
performing a topological clustering of the network.
We calculate degree centrality, betweenness centrality,
closeness centrality, and PageRank centrality. The
clustering process is visualized by a dendrogram, which is
a hierarchical tree diagram. Then, we interview the
managers of the company.
Our data are unique in three ways. (1) The email log
of a fairly large organization is collected. (2) Two sets
of data are collected for chronological analysis. (3) The
collection of the data sets coincides with the drastic
organizational change caused by the unprecedented
business impact of the bankruptcy of Lehman Brothers in
September 2008. Consequently, we have data sets for
analyzing the organization before and after the impact of
the global recession, from the perspective of informal
communities, by email network analysis.
According to the interview with the managers of the
company, the top management team resolutely carried out
organizational changes to survive the global recession,
aiming at (1) restructuring of highly paid managers,
(2) rejuvenation for organizational vitality, and
(3) reintegration of divisions for innovation. We attempt
to evaluate and verify these organizational changes with
email network analysis. In addition to the informal
community analysis, we compare leader characteristics
before and after the changes using network centralities
and communication patterns.
Informal networks coexist with the formal structure
of an organization and serve many purposes, such as
resolving the conflicting goals of the institution to
which they belong, solving problems in more efficient
ways [2], and furthering the interests of their members.
Despite their lack of official recognition, informal
networks can provide effective ways of learning and, with
the proper incentives, can actually enhance the
productivity of the formal organization [3, 4]. Along with
the growth of informal communities, leadership roles in
the communities have become distributed [5]. Given the
dynamics of community formation and distributed
leadership, it is important for organizational management
to extract such hidden patterns of collaboration and
leadership, which could lead to innovation.
The previous approach to identifying informal
communities was to gather data from interviews, surveys,
or other fieldwork and to construct links and communities
by manual inspection [6] or by an internet-centric approach
[7]. These methods are accurate but time-consuming and
labor-intensive, prohibitively so in the context of a very
large organization. Given the recent development of
online communication in organizations, several studies
have worked on identifying communities using online
information resources [8]. Adamic showed that
communities identified from online mailing lists and the
Web resemble the actual social communities of the
represented individuals [9].

H. Tashiro (1), J. Mori (1), N. Fujii (2), and K. Matsushima (1)
(1) Graduate School of Engineering, the University of Tokyo, Tokyo, Japan
(2) Faculty of Science and Engineering, Waseda University, Tokyo, Japan
{jmori, tashiro}@ipr-ctr.t.u-tokyo.ac.jp
978-1-4244-6567-5/10/$26.00 ©2010 IEEE

Among the various means of communication, email has
become a widely used means of communication in
organizations. Email has therefore been established as an
indicator of collaboration and knowledge exchange [8, 10,
11]. Since email provides plentiful data on personal
communication in an electronic form that enables
automatic processing, several studies have addressed the
use of email to discover shared interests, relationships,
and social networks [12, 13]. Because they reveal the
structure and communication patterns within an
organization [14, 15], email networks are useful
information resources for finding informal communities.
Several studies have proposed automated methods for
using email data to construct a network and then identify
informal communities within an organization [16, 17].
However, there is not yet enough understanding and
evaluation of how communities identified from email data
can be exploited for the management of organizations and
leadership, which is important for enhancing
organizational innovation.
In this paper, we collect and analyze a second set of
one-month email log data with our method for
identifying informal communities and potential leaders.
We use a clustering method that can rapidly detect
dense communities within an email network. The result of
the clustering process reveals informal communities and
hierarchical structures within an organization. To
characterize the people in the informal communities, we
calculate several network centralities for each person
using the structure of the email network. Together with
the interviews with the managers, these measures enable
us to identify leadership roles within the informal
communities. Then, we compare the two email
communication networks to see whether we can draw any
managerial implications of significance to the top
management.


II. METHODOLOGY

A. Email network
We construct an email network from email log data.
We extract the information about sender and receiver
from each email. The sender or receiver corresponds to a
node in the network. If there is at least one email
communication between persons, an edge is then drawn
between these persons. As a sum of the nodes and edges,
we finally obtain the email network. Since we distinguish
a sender from a receiver, an email network is expressed as
a directed graph. Given the network, we find a maximal
complete sub-graph as a clique, which becomes the target of
the subsequent network analysis.
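
The construction described above can be sketched as follows. This is a minimal illustration, assuming the log has already been reduced to (sender, receiver) pairs; the paper does not give the log format, and the function name `build_email_network` and the sample names are hypothetical.

```python
from collections import defaultdict


def build_email_network(log):
    """Build a directed graph from (sender, receiver) pairs.

    `log` is a hypothetical iterable with one (sender, receiver) tuple per
    email. Returns a dict mapping each node to the set of nodes it emailed.
    """
    graph = defaultdict(set)
    for sender, receiver in log:
        if sender == receiver:
            continue  # assumption: self-addressed mail is ignored
        graph[sender].add(receiver)  # one edge, however many emails were sent
        graph[receiver]              # touching the key registers the receiver as a node
    return dict(graph)


# Hypothetical sample log: two emails from ann to bob give a single edge.
log = [("ann", "bob"), ("ann", "bob"), ("bob", "cho"), ("cho", "ann")]
network = build_email_network(log)
edge_count = sum(len(targets) for targets in network.values())
```

Storing successors in a set reflects the rule that a single email is enough to draw an edge, and keeping the direction reflects the distinction between sender and receiver.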


B. Email network analysis
We first identify communities in the email network.
To this aim, we perform a topological clustering of
networks. Such clustering used to be difficult because
cluster analysis of unweighted graphs with a large number
of nodes is computationally demanding. Recently proposed
algorithms [18, 19], however, enable fast clustering with
a calculation time on the order of $O((l+n)n)$, or $O(n^2)$
on a sparse network, where n is the number of nodes and l
is the number of links; hence they can be applied to
large-scale networks.
The algorithm proposed was based on the idea of
modularity. Modularity Q was defined as follows [18, 19,
20]:

$$Q = \sum_{s=1}^{N_m} \left[ \frac{l_s}{l} - \left( \frac{d_s}{2l} \right)^2 \right]$$

where $N_m$ is the number of modules, $l$ is the total number
of links in the network, $l_s$ is the number of links between
nodes in module s, and $d_s$ is the sum of the degrees of the
nodes in module s. In other words, Q is the
fraction of links that fall within modules, minus the
expected value of the same quantity if the links fall at
random without regard for the modular structure.
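
To make the definition concrete, the following sketch evaluates Q for a given partition of a small undirected network. It is an illustration only: the function name `modularity`, the edge-list representation, and the six-node example (two triangles joined by one link) are assumptions, not taken from the paper.

```python
from collections import defaultdict


def modularity(edges, module_of):
    """Q = sum over modules s of [ l_s / l - (d_s / (2 l))^2 ].

    edges: list of (u, v) pairs, each undirected link listed once.
    module_of: dict mapping node -> module label.
    """
    l = len(edges)
    links_inside = defaultdict(int)  # l_s: links with both ends in module s
    degree_sum = defaultdict(int)    # d_s: sum of degrees of nodes in module s
    for u, v in edges:
        degree_sum[module_of[u]] += 1
        degree_sum[module_of[v]] += 1
        if module_of[u] == module_of[v]:
            links_inside[module_of[u]] += 1
    return sum(links_inside[s] / l - (degree_sum[s] / (2 * l)) ** 2
               for s in degree_sum)


# Hypothetical example: two triangles joined by the single link (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_module = {node: "all" for node in range(6)}

q_two = modularity(edges, two_modules)  # 2 * (3/7 - (7/14)^2) = 5/14
q_one = modularity(edges, one_module)   # 7/7 - (14/14)^2 = 0
```

Splitting the example along the bridge gives a clearly positive Q, while placing every node in one module gives Q = 0, which matches the reading of Q as the excess of within-module links over the random expectation.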
A good partition of a network into clusters must
comprise many within-cluster links and as few
between-cluster links as possible. The objective of a
community identification algorithm is to find the
partition with the largest modularity. The algorithm used
to optimize Q is as follows. Starting from a state in
which each node is the sole member of one of n clusters,
we repeatedly join clusters together in pairs, choosing at
each step the join that results in the greatest increase
in Q. Since a high value of Q represents a good cluster
division, we stop joining when ΔQ becomes negative. At the
maximal value of Q, $Q_{\max}$, we obtain a cluster structure
that divides the network effectively. The clusters
correspond to the informal communities in the email
network. A label can be assigned to each cluster by
examining the attributes of its nodes.
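
The joining procedure described above can be sketched in a few lines. This is a naive illustration that recomputes every candidate gain at each step; it does not reproduce the optimized bookkeeping of the algorithms in [18, 19]. The function name `greedy_communities` is hypothetical. The gain of joining clusters a and b, ΔQ = l_ab / l - d_a d_b / (2 l^2), follows from the definition of Q, with l_ab the number of links between a and b.

```python
from collections import defaultdict


def greedy_communities(edges):
    """Greedily join clusters while the best join does not decrease Q.

    edges: list of (u, v) pairs, each undirected link listed once.
    Returns a list of sets of nodes.
    """
    l = len(edges)
    members = {n: {n} for edge in edges for n in edge}  # cluster id -> nodes
    degree = defaultdict(int)    # cluster id -> d_s
    between = defaultdict(int)   # frozenset({a, b}) -> links between a and b
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if u != v:
            between[frozenset((u, v))] += 1

    def gain(pair):
        a, b = tuple(pair)
        return between[pair] / l - degree[a] * degree[b] / (2 * l * l)

    # Only linked clusters are candidates: joining unlinked ones lowers Q.
    while between:
        best = max(between, key=gain)
        if gain(best) < 0:
            break  # stop when the best available join has negative gain
        a, b = tuple(best)  # cluster a absorbs cluster b
        members[a] |= members.pop(b)
        degree[a] += degree.pop(b)
        del between[best]
        # Links that used to reach b now reach the merged cluster a.
        for pair in [p for p in between if b in p]:
            (other,) = pair - {b}
            between[frozenset((a, other))] += between.pop(pair)
    return list(members.values())


# Hypothetical example: two triangles joined by the single link (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
clusters = greedy_communities(edges)
```

Recording the order of the joins, which this sketch omits, would give the merge history that a dendrogram displays.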
A node in a cluster is characterized with its network
centralities [21]. We calculate centralities as follows.

- Degree centrality: the number of links of a node.
- Betweenness centrality: the number of shortest paths
between pairs of other nodes that pass through a node.
- Closeness centrality: based on the average shortest-path
length from a node to the other nodes.
- PageRank centrality: the node's probability in the
stationary distribution of the Markov chain corresponding
to the stochastic transition matrix of the network.
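
The three path-based measures in the list above can be sketched as follows. For brevity the sketch assumes an undirected, unweighted network given as a symmetric adjacency dict, whereas the email network in this paper is directed. The function names are hypothetical, and betweenness is computed with Brandes' algorithm, which is a choice made here and not necessarily the implementation used in the study.

```python
from collections import deque


def degree_centrality(adj):
    """Number of links of each node."""
    return {u: len(adj[u]) for u in adj}


def average_distance(adj):
    """Average shortest-path length from each node to the nodes it can reach."""
    result = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        others = [d for v, d in dist.items() if v != s]
        result[s] = sum(others) / len(others) if others else float("inf")
    return result


def betweenness_centrality(adj):
    """Shortest paths through each node, counting every unordered pair once."""
    score = {u: 0.0 for u in adj}
    for s in adj:
        order = []                     # nodes in order of discovery
        preds = {u: [] for u in adj}   # predecessors on shortest paths from s
        sigma = {u: 0 for u in adj}    # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {u: 0.0 for u in adj}  # dependency of s on each node
        for v in reversed(order):
            for u in preds[v]:
                delta[u] += sigma[u] / sigma[v] * (1 + delta[v])
            if v != s:
                score[v] += delta[v]
    return {u: b / 2 for u, b in score.items()}  # each pair was seen twice


# Hypothetical example: two triangles joined by the link between 2 and 3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
degree = degree_centrality(adj)
avg_dist = average_distance(adj)
betweenness = betweenness_centrality(adj)
```

In the example, the two nodes on the bridge have only one more link than the others, yet they carry all shortest paths between the triangles, which illustrates why betweenness can reveal bridging roles that degree alone would miss.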


Assuming that leadership is influenced by
communication and trust in one’s social network [5],
leadership roles are characterized by these centralities.
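
PageRank, the remaining centrality, can be approximated by power iteration on the directed email network. The sketch below is illustrative only: the damping factor of 0.85 and the fixed iteration count are conventional choices assumed here, since the paper does not state its parameters, and the function name `pagerank` and the sample names are hypothetical.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Approximate the stationary distribution by power iteration.

    out_links: dict mapping every node to the set of nodes it emails.
    damping and iterations are assumed values, not taken from the paper.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        new_rank = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            # A node that emails nobody spreads its rank over all nodes.
            targets = out_links[u] or nodes
            share = damping * rank[u] / len(targets)
            for v in targets:
                new_rank[v] += share
        rank = new_rank
    return rank


# Hypothetical example: bob and cho both write to ann; ann writes to bob.
ranks = pagerank({"ann": {"bob"}, "bob": {"ann"}, "cho": {"ann"}})
```

In this toy network ann receives mail from both colleagues and ranks highest, bob inherits rank from ann, and cho, who receives nothing, ranks lowest.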

C. Email network visualization


To visualize the large-scale network, we employ the
force-directed GEM layout [22]. GEM optimizes for minimal
node distances and constant edge lengths and, as a result,
draws the network as a circle. This layout helps give an
overview of the identified clusters in the network.
The clustering process is visualized by a dendrogram,
a tree diagram frequently used to illustrate the
arrangement of the clusters produced by hierarchical
clustering. A dendrogram helps show the hierarchical
structure among clusters and therefore helps us understand
how the identified communities are related to each other.

Proceedings of the 2010 IEEE ICMIT


III. RESULTS

We applied our method to actual email data from one
firm. We collected two sets of one-month email logs, in
September 2008 (data1) and in June 2009 (data2), in order
to compare them chronologically and analyze any changes.
Data1 includes the emails of 2,882 employees and data2
includes the emails of 2,459 employees. For reasons of
privacy and complexity, we used only emails whose origin
and destination were both internal to the firm.
Table I shows the properties of the network built from
data1. Each node has 51.77 links on average, and the whole
network exhibits a power-law degree distribution (see
Fig. 1). It also has “small-world” properties: the
clustering coefficient is much larger than that of a
random network (0.387 vs. 0.010), while the average path
length is close to that of a random network (2.67 vs.
2.74) (see Table I).
Table II shows the properties of the network built from
data2. Each node has 36.151 links on average, and the whole
network exhibits a power-law degree distribution (see
Fig. 2). It also has “small-world” properties: the
clustering coefficient is much larger than that of a
random network (0.377 vs. 0.010), while the average path
length is close to that of a random network (2.72 vs.
2.76) (see Table II).
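
The two quantities behind the “small-world” comparison can be computed as in the sketch below. It assumes a connected, undirected network given as a symmetric adjacency dict and adopts the convention that nodes with fewer than two neighbours contribute zero to the clustering coefficient; the paper does not state which convention it used, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj):
    """Average, over nodes, of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for u, neighbours in adj.items():
        k = len(neighbours)
        if k < 2:
            continue  # assumed convention: such nodes contribute 0
        linked = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
        total += linked / (k * (k - 1) / 2)
    return total / len(adj)


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of reachable nodes."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Hypothetical example: two triangles joined by the link between 2 and 3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
c = clustering_coefficient(adj)   # (4 * 1 + 2 * 1/3) / 6 = 7/9
path = average_path_length(adj)   # 54 / 30 = 1.8
```

A network is called small-world when the first value is far above that of a random network of the same size and density while the second stays close to it, which is the pattern reported in Tables I and II.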
We applied the algorithm described above to identify
the communities within the network. From data1 we
obtained seven distinct clusters, as shown in Fig. 3, and
identified the hierarchical structure among the
communities, as shown in Fig. 5. From data2 we obtained
four distinct clusters, as shown in Fig. 4, and identified
the hierarchical structure among the communities, as
shown in Fig. 6.
We manually checked the division to which each
employee in a cluster belongs. We found that each cluster
corresponded closely to one division, or to a combination
of several divisions, in the firm. We showed the results
to people from the firm and conducted interviews. They
agreed that both the identified communities and the
hierarchical structure reflect the actual organizational
structure of the firm. They pointed out that some
identified clusters match informal communities that play
important roles in the management of the organization.
We also showed them the people who have high network
centralities in each community. They recognized most of
these people as key persons in the firm. However, they
also found some people whom they had not expected to have
high network centralities. Further interviews revealed
that such people have leadership potential for the
management of the organization. In particular, we found
that, among the centralities, both betweenness and
PageRank are good indicators for detecting such hidden
leadership.


Fig. 1. Degree distribution of the email network
(2008.09).














Fig. 2. Degree distribution of the email network
(2009.06).




TABLE I
PROPERTIES OF THE EMAIL NETWORK (2008.09)

n        average k    C                L
2,882    51.77        0.387 (0.010)    2.67 (2.74)

n: number of nodes, k: number of links
C: Clustering coefficient, L: Average path length
Values in parentheses are those of a random network.

TABLE II
PROPERTIES OF THE EMAIL NETWORK (2009.06)

n        average k    C                L
2,459    36.151       0.377 (0.010)    2.72 (2.76)

n: number of nodes, k: number of links
C: Clustering coefficient, L: Average path length
Values in parentheses are those of a random network.

Fig. 3. Clusters of the email network (2008.09).



Fig. 4. Clusters of the email network (2009.06).



Fig. 5. Dendrogram of Clusters of the email network
(2008.09).





Fig. 6. Dendrogram of Clusters of the email network
(2009.06).
IV. DISCUSSION

A. Small World

The email communication network retains the
properties of a scale-free and “small-world” network. The
number of nodes, that is, senders and receivers of emails,
decreased by 14.7%, from 2,882 in September 2008 to 2,459.
The number of edges, that is, email communication links
between nodes, decreased by 40.4%, from 74,601 to
44,448. The average degree, that is, the average number of
people with whom a node communicates by email, decreased
by 30.2%, from 52 to 36. The clustering coefficient, the
tendency of nodes to group together, decreased by 2.5%,
while the average path length increased by 2.1%.
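
The reported counts can be cross-checked with simple arithmetic. The sketch below assumes that each link is counted once, so that the number of edges is l = n k / 2; the paper does not state this convention, but it reproduces the reported edge counts from the values in Tables I and II. The helper name `percent_decrease` is hypothetical.

```python
def percent_decrease(before, after):
    """Relative decrease from `before` to `after`, in percent."""
    return (before - after) / before * 100


n1, k1 = 2882, 51.77    # data1 (Table I): nodes, average degree
n2, k2 = 2459, 36.151   # data2 (Table II): nodes, average degree

# Assumption: every link is counted once, so l = n * k / 2.
edges1 = round(n1 * k1 / 2)
edges2 = round(n2 * k2 / 2)

node_drop = percent_decrease(n1, n2)
edge_drop = percent_decrease(edges1, edges2)
degree_drop = percent_decrease(k1, k2)
```

Under that assumption the edge counts come out as 74,601 and 44,448, and the three decreases round to 14.7%, 40.4%, and 30.2%, in agreement with the figures quoted above.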
These results indicate a drastic reduction in the
number of email users and a change in the email behavior
of employees. The number of email communication links in
the organization fell, and the scope of communication
became more focused than in the previous period.
The interview with the managers revealed that the
company had offered a voluntary early retirement program
to highly paid seniors and managers in order to improve
its income statement. Consequently, the organization was
slimmed down and restructured. The concurrent reduction
of both overtime and the number of workers put employees
under time pressure to send fewer emails. In the past,
the seniors and managers who later retired early had to
be included in the communication network. Their early
retirement reduced both direct emails and the carbon
copies sent for bureaucratic reasons.
The top management achieved its aim of higher
productivity by reducing the input of management
resources, even though sales decreased sharply during the
global recession. The analysis shows that the
organization as a whole met the challenge of time
constraints partly through a radical reduction in the
time spent on email, including the preparation of
attachments.

B. Formal and Informal Organization

The email network with clusters represents one aspect
of the organizational reality. The number of clusters has
decreased from 7 to 4. The previous cluster C was merged
with A, forming a cluster of 770 nodes. The previous D
remains as the smallest cluster D of 19 nodes, and the
previous cluster B remains as the present B of 265 nodes.
The previous clusters E and G were merged with the
cluster F, forming the largest present cluster F of 1,405
nodes.
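
One way to trace such merges between two snapshots is to count how the members of each earlier cluster are distributed over the later clusters. The sketch below illustrates that idea on made-up assignments; the function names and the toy data are hypothetical, and the paper does not describe how the correspondence between clusters was established. The final lines only check that the four reported cluster sizes add up to the 2,459 nodes of data2.

```python
from collections import Counter


def cluster_overlap(before, after):
    """Count people by (old cluster, new cluster), for those in both snapshots."""
    return Counter((before[p], after[p]) for p in before if p in after)


def merged_into(before, after):
    """For each old cluster, the new cluster that received most of its members."""
    best = {}
    for (old, new), count in cluster_overlap(before, after).items():
        if old not in best or count > best[old][1]:
            best[old] = (new, count)
    return {old: new for old, (new, _) in best.items()}


# Made-up assignments for six people; p6 is absent from the later snapshot.
before = {"p1": "A", "p2": "A", "p3": "C", "p4": "C", "p5": "E", "p6": "G"}
after = {"p1": "A", "p2": "A", "p3": "A", "p4": "A", "p5": "F"}
mapping = merged_into(before, after)

# Cluster sizes reported in the text for June 2009.
reported_sizes = {"A": 770, "B": 265, "D": 19, "F": 1405}
total_nodes = sum(reported_sizes.values())
```

In the toy data, cluster C maps to A and cluster E maps to F, mirroring the kind of merge described above, while people who left the organization simply drop out of the count.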
According to the dendrogram analysis, the previous
clusters A and B had a stronger tie with each other.
After the major organizational restructuring, however,
clusters B and F are now closer. As cluster D supplies
parts to cluster A, they remain closely related.

In the previous section, we observed the productivity
increase of the new organization after the major change.
On the other hand, we can deduce that lower-priority
emails decreased and that necessary work-related emails
now dominate. The majority of the informal layer of
communities was removed from the email communication
network. Under these assumptions, the current
communication network with its clusters represents a
largely job-related communication network.
According to the interview, the top management
aimed to integrate the headquarters with the business
divisions. The largest cluster, F, shows the integration
of headquarters functions with one business division and
its business branches. Physical location and peer
relationships became less significant than work
relationships in the dendrogram, as evidenced by the
merger of clusters E and G with cluster F.
Although clusters A and B are physically close, on a
business basis cluster B is now closer to cluster F. The
independence of cluster A, however, was emphasized, as it
is a self-sufficient organization. Observing these
changes in the clusters is meaningful for evaluating the
top management’s intention behind the organizational
change and the reshuffle of managers.
Email network analysis, with its clustered
communication network, provides the top management with
a relatively objective picture of the organization before
and after the organizational changes as well as the
changes in its environment. This is powerful feedback for
the top management team in evaluating the status of the
organization and the performance of its strategies. As
changes in business circumstances become faster and more
turbulent, quick feedback and a chronological database
will help organizational leaders manage effectively.

C. Individual Centralities

None of the top 30 employees in the previous
PageRank or betweenness lists was ranked within the
current top 30. In other words, the people with high
scores for communication importance and bridging were
replaced by new groups. On the other hand, according to
the interview, PageRank and betweenness were still
indicators of potential leaders. The communication
structures changed dynamically through the major
restructuring.
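
A comparison of this kind can be expressed as the overlap between two top-k lists. The sketch below uses made-up scores for four people; the function names and the data are hypothetical, and only the idea of comparing the two top-30 lists comes from the text.

```python
def top_k(scores, k):
    """The k people with the highest score, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [person for person, _ in ranked[:k]]


def top_k_overlap(scores_before, scores_after, k=30):
    """People who appear in the top k of both periods."""
    return set(top_k(scores_before, k)) & set(top_k(scores_after, k))


# Made-up centrality scores for the two periods.
before = {"p1": 0.9, "p2": 0.7, "p3": 0.2, "p4": 0.1}
after = {"p1": 0.1, "p2": 0.2, "p3": 0.8, "p4": 0.6}

overlap_top2 = top_k_overlap(before, after, k=2)
overlap_top3 = top_k_overlap(before, after, k=3)
```

An empty overlap, as in the top-2 case of the toy data, corresponds to the complete turnover reported above for the top 30.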
The managers told us in the interview that, a year
earlier, degree centrality had been higher for
administrative assistants, office clerks, and people who
had built their own informal networks over long careers
in the organization. The analysis of data2, however,
showed that the new divisional managers had higher
degrees. Expatriates who had returned from overseas and
young analytic engineers also had higher scores than
before. Owing to the early retirement of seniors in their
50s, the new organizational communication network shifted
in the healthy direction that the top management
intended.

The innovative leaders have created an environment
for knowledge interaction through communication among
their members, and an ecosystem of knowledge creation. As
the cores of the communication network clusters, they
have managed effective communication through their strong
visions of organizational success. Visualizing such
leaders and their communication patterns as managers of
successful teams has helped the top management design and
implement its strategy for innovation management.

D. Future Study

In the near future, we plan to narrow our focus to
the engineering groups and their interactions with the
entire organization. Recently, to stimulate innovation,
the top management set up an internal engineering
community program within the engineering organization. An
engineering community is a group of engineers, drawn from
across the organization, that works on one technological
element. The members of these cross-divisional
communities range from novices to experienced
specialists. The top management needs to evaluate the
activities of the leaders and their communities. We are
going to analyze the email communication networks of the
engineering communities with clusters and network
centralities. We also plan to compare innovative
activities before and after the introduction of the
engineering community program.



V. CONCLUSION

We observed, chronologically, the effects of
managerial decisions on organizational change by applying
network analysis to email log data. Changes in the
characteristics of communities are brought into clear
relief by cluster analysis. Leadership roles did not
change between the before and after analyses, while
leaders reduced their influence as bridges among
communities.
Our method helps management view its organization
systematically as a whole by using email network
analysis. Email network analysis can be used to evaluate
communication and interaction among members. It also
helps identify candidate leaders who act as hubs of the
information channels in the communication network.
A formal organization can be evaluated against its
informal communities before and after major
organizational changes. Interviews and questionnaires are
the traditional means of capturing the state of an
organization. Email network analysis provides one more
significant, objective, and analytical tool for a
manager’s tool box.


REFERENCES


[1] J. Mori, H. Tashiro, K. Haraoka, and K. Matsushima,
“Identifying Informal Communities and Leaders for
Total Quality Management using Network Analysis
of Email,” In Proc. of the International Conference on
Industrial Engineering and Engineering Management
(IEEM), 2009.
[2] B. A. Huberman and T. Hogg, “Communities of
Practice: Performance and Evolution”.
Computational and Mathematical Organization
Theory, Vol. 1, pp. 73-92, 1995.
[3] D. Crane, Invisible Colleges: Diffusion of Knowledge
in Scientific Communities. University of Chicago
Press, Chicago, 1972
[4] J. Lave and E. Wenger, Situated Learning: Legitimate
Peripheral Participation. Cambridge University Press,
1991.
[5] Chen, J. Li, and H. Wang, “Structure and Dynamics
of Distributed Leadership in the Perspective of Social
Network Analysis,” In Proc. of the International
Conference on Industrial Engineering and
Engineering Management (IEEM), 2008.
[6] T. Allen, Managing the Flow of Technology. MIT
Press, 1984.
[7] L. Garton, C. Haythornthwaite, and B. Wellman,
“Studying online social networks,” Journal of
Computer-Mediated Communication, Vol. 3, No.
1, 1997.
[8] B. Wellman, “Computer Networks As Social
Networks,” Science Vol. 293, No. 14, 2001.
[9] L. A. Adamic and E. Adar, “Friends and Neighbors
on the Web,” Journal of Social Networks, Vol. 25,
No. 3, 2002.
[10] S. Whittaker and C. Sidner, “Email Overload:
Exploring Personal Information Management of
Email”, in Proc. of CHI ’96, pp. 276-283, 1996.
[11] N. Ducheneaut and V. Bellotti, “A Study of Email
Work Processes in Three Organizations,” Journal of
CSCW, 2002.
[12] M. F. Schwartz and D. C. M. Wood, “Discovering
Shared Interests Among People Using Graph
Analysis”, Communications of the ACM, volume 36,
issue 8, pp. 78-89, 1992.
[13] A. Culotta, R. Bekkerman and A. McCallum,
“Extracting social networks and contact information
from email and the Web,” First Conference on Email
and Anti-Spam, 2004.
[14] R. S. Burt, “Models of Network Structure”, Annual
Review of Sociology, Vol. 6, pp. 79-141, 1980.
[15] W. R. Scott, Organizations: Rational, Natural, and
Open Systems. Prentice-Hall, Inc., Englewood Cliffs,
New Jersey, 1992.
[16] J. R. Tyler, D. M. Wilkinson, and B. A. Huberman,
"Email as spectroscopy: Automated discovery of
community structure within organizations," The
Information Society, Vol. 21, No. 2, pp. 143-153,
2005.
[17] J. Diesner and K. M. Carley, “Exploration of
communication networks from the Enron email
corpus,” Proceedings of Workshop on Link Analysis,
Counterterrorism and Security, SIAM International
Conference on Data Mining, pp. 3-14, 2005.
[18] M. E. J. Newman, “Fast algorithm for detecting
community structure in networks,” Physical Review
E, vol. 69, art no. 066133, 2004.
[19] M. E. J. Newman and M. Girvan, “Finding and
evaluating community structure in networks,”
Physical Review E, vol. 69, art no. 026113, 2004.
[20] R. Guimerà, M. Sales–Pardo, and L. A. N. Amaral,
“Modularity from fluctuations in random graphs and
complex networks,” Physical Review E, vol. 70, art
no. 025101, 2004.
[21] L. C. Freeman, "Centrality in Social Networks:
Conceptual Clarification," Social Networks, Vol.1,
pp.215-239, 1978.
[22] A. Frick, A. Ludwig, and H. Mehldau, “A fast
adaptive layout algorithm for undirected graphs,” in R.
Tamassia and I. G. Tollis, editors, Graph Drawing
(Proc. GD ’94), volume 894 of Lecture Notes
Comput. Sci., pages 388–403. Springer-Verlag, 1995.