Resource Information Retrieval Using SENS A Scalable and Expressive Naming System

Hoaison NGUYEN1, Hiroyuki MORIKAWA2 and Tomonori AOYAMA3

1 College of Technology, Hanoi University
2 School of Frontier Sciences, The University of Tokyo
3 Research Institute for Digital Media and Content, Keio University

Abstract. We design a scalable and expressive naming system called SENS, which retrieves information about computing and content resources distributed widely on the Internet through exact queries and multi-attribute range queries over resource names. Our system uses a descriptive naming scheme to name resources and a multi-dimensional resource ID space for message routing through the overlay network of name servers (NSs). The resource ID space is constructed on the overlay network using the CAN routing algorithm. We propose a novel mapping scheme between resource names and resource IDs that preserves the locality of resource IDs while still achieving a good degree of load balancing in the distribution of resource information. We also propose a multicast routing algorithm to deliver resource information and a broadcast routing algorithm to route query messages to the corresponding NSs at small cost. Our simulation results show that our system achieves good routing performance and load balancing.

1 Introduction

At present, the general approach to providing autonomous information systems such as web services, ubiquitous computing systems [1] or Grid computing systems [2] with resource information is to publish the descriptions of resources to a directory service. MDS-2 provides directory services for Grid computing systems. It uses LDAP [4], a hierarchical directory service, as a uniform interface for accessing and managing information about the status of Grid computing resources. For Web services, Universal Description, Discovery and Integration (UDDI) [3] is used to discover services. A service provider describes its service using WSDL and publishes the description to the UDDI directory service. Service consumers then ask the UDDI directory service for the services they need. However, conventional directory services provide only exact queries and do not support richer query functions such as range queries. Since the evolution of the Internet now gives us access to a huge number of ubiquitous computing resources, we consider a scalable and expressive information retrieval service essential for autonomous information systems.
We design a scalable and expressive naming system called SENS to provide a resource information retrieval service based on resource names. Our system uses a descriptive naming scheme that names each resource by a tuple of attribute/value pairs. Resource information is stored at a large number of name servers (NSs). Our system retrieves resource information by exact queries (i.e. querying information of a resource whose resource name is the same as the query name) and multi-attribute range queries (i.e. querying information of resources whose names have attribute values within a query range). It routes query messages to the NSs that are responsible for the queried resource names.
Our challenge is to design a message routing protocol on the overlay network of NSs that achieves scalable and efficient resource information distribution and querying. SENS constructs a high-dimensional resource ID space (i.e. a DHT key space) on the overlay network of NSs and maps resource names to the resource ID space such that both the locality of resource names in the resource ID space and load balancing are maintained. Message routing for resource information queries is performed based on the locality of resource names. We propose a multicast routing algorithm to deliver resource information and a broadcast routing algorithm to route query messages to the corresponding NSs at small cost. Our simulation results show that our system achieves good routing performance and load balancing.

The rest of this paper is structured as follows. Section 2 presents the background of our research and Section 3 the design of our system. Section 4 describes our simulation and its results. Section 5 discusses applications of SENS and Section 6 presents conclusions and future work.

2 Background

2.1 Related work

Several works such as INS [7] and ENS [8] have pursued an expressive naming system by routing messages based on resource names. However, this approach is not scalable, since the routing table becomes unacceptably large as the number of resources increases.
Recently, Distributed Hash Tables (DHTs) such as CAN [9] and Chord [10] have attracted a lot of attention since they offer a promising solution for scalable message routing on overlay networks. However, DHTs realize range queries only with very large overhead, because their consistent hashing function maps the resource names in a range of attribute values to a large number of DHT keys.
To resolve this problem, a locality-preserving hash function has been used to map each attribute/numerical value pair of a resource name to a DHT key [12–14]. Resource information location and query resolution are performed based on the DHT keys obtained from the values of one attribute in the query name. However, this approach does not scale well because the distribution of attribute/value pairs in resource names is often skewed. NSs responsible for popular attribute/value pairs will suffer from a heavy load of resource information registrations and queries.
2.2 CAN routing

We use CAN routing [9] as the message routing algorithm on the overlay network of NSs. The reason for using CAN is that, in addition to the advantages of a DHT routing algorithm, CAN can construct a d-dimensional resource ID space on the overlay network of NSs. CAN routing is performed as follows.
The resource ID space is partitioned into hyper-rectangles, called zones. Each NS is responsible for one zone. When a new NS joins the overlay network, it chooses an initial point $P_i$ in the resource ID space and sends a request to the existing NS responsible for the zone within which $P_i$ lies. The existing NS splits its zone in half, retaining one half and handing the other half to the new NS.
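The join step can be sketched in a few lines of Python. This is an illustrative sketch only: the `Zone` class, the `join` function, the use of half-open intervals and the rule that the joining NS receives the half containing its initial point are assumptions made for the example, since the text does not specify which half is handed over.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # half-open interval [lo, hi)


@dataclass
class Zone:
    bounds: List[Interval]  # one interval per dimension

    def contains(self, point: Sequence[float]) -> bool:
        return all(lo <= x < hi for (lo, hi), x in zip(self.bounds, point))


def join(zone: Zone, point: Sequence[float], dim: int) -> Tuple[Zone, Zone]:
    """Split `zone` in half along `dim`; return (zone kept, zone handed over)."""
    if not zone.contains(point):
        raise ValueError("the initial point must lie inside the zone being split")
    lo, hi = zone.bounds[dim]
    mid = (lo + hi) / 2
    lower = Zone(zone.bounds[:dim] + [(lo, mid)] + zone.bounds[dim + 1:])
    upper = Zone(zone.bounds[:dim] + [(mid, hi)] + zone.bounds[dim + 1:])
    # Assumption: the joining NS receives the half that contains its point.
    return (lower, upper) if upper.contains(point) else (upper, lower)
```

For a 2-dimensional zone covering [0, 4) in both dimensions and an initial point (3, 1), splitting along the first dimension leaves the existing NS with [0, 2) and hands [2, 4) to the new NS.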


Fig. 1. A sample of 2-dimensional CAN routing (the figure marks the sending node, the destination node and the routing table)

Each NS maintains a coordinate routing table that holds the addresses and zones of its neighbor NSs. Two NSs are neighbors if their zones overlap along d − 1 dimensions and abut along one dimension. Using its coordinate routing table, a NS routes a message towards the NS responsible for the destination ID by simple greedy forwarding to the neighbor NS whose zone is closest to the destination ID (Fig. 1).
The CAN routing algorithm achieves good routing performance on the overlay network of NSs. As shown in [9], in a d-dimensional resource ID space with N NSs, where d is a small number, the size of a coordinate routing table is $O(d)$ and the path length between two NSs is $O(N^{1/d})$. In a high-dimensional resource ID space with $d > \log_2 N$, if the space is divided along dimensions chosen in a fixed cyclical order, the size of a coordinate routing table and the path length between two NSs are both $O(\log N)$ [15].
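The neighbor rule and the greedy forwarding step can be illustrated as follows. This is a minimal sketch under stated assumptions: zones are lists of (lo, hi) intervals, the coordinate space is treated as non-wrapping, and "closest" is taken to mean the Euclidean distance from the destination point to the nearest point of a zone. The function names are hypothetical.

```python
import math


def are_neighbors(a, b):
    """True if zones a and b abut along exactly one dimension and overlap along the rest."""
    abutting = 0
    for (a_lo, a_hi), (b_lo, b_hi) in zip(a, b):
        if a_hi == b_lo or b_hi == a_lo:
            abutting += 1                      # the zones touch in this dimension
        elif not (a_lo < b_hi and b_lo < a_hi):
            return False                       # disjoint in this dimension
    return abutting == 1


def distance_to_zone(zone, point):
    """Euclidean distance from point to the nearest point of the zone."""
    return math.sqrt(sum(max(lo - x, 0, x - hi) ** 2
                         for (lo, hi), x in zip(zone, point)))


def next_hop(neighbors, destination):
    """Greedy forwarding: pick the neighbor whose zone is closest to the destination."""
    return min(neighbors, key=lambda name: distance_to_zone(neighbors[name], destination))
```

With four square zones A = [0,2]x[0,2], B = [2,4]x[0,2], C = [0,2]x[2,4] and D = [2,4]x[2,4], A is a neighbor of B and C but not of D, which only touches A at a corner.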

3 Design of SENS

3.1 Overview

Our system achieves expressive resource naming by using a descriptive naming scheme that names a resource by a tuple of attribute/value pairs. For example, a computer is named as: (string OS = “linux”, string CPU-name = “Pentium 4”, int CPU-clock (MHz) = 1000, int memory (MB) = 1024, int harddisk-unusedspace (GB) = 20, int network-bandwidth (Mbps) = 1000). An attribute consists of a data type and a name. The data type (e.g. string, integer, Boolean) determines the type of value that the attribute can take. The name of an attribute expresses the semantics of the attribute/value pair. Each kind of resource has a set of attributes used for naming. A resource name may contain dozens of attribute/value pairs.
In the case of an exact query, a NS queries information of the resource whose resource name is the same as the queried name. In the case of a range query, a NS queries information of resources whose names satisfy the query ranges of attribute values. A query range is expressed with inequality operators (>, <, ≤, ≥) and the conjunction operator (&). For example, our system can realize a range query for computing resources expressed as: (string OS = “linux”, string CPU-name = “Pentium 4”, int CPU-clock (MHz) ≥ 1000 & int CPU-clock (MHz) ≤ 1200, int memory (MB) ≥ 512, int harddisk-unusedspace (GB) ≥ 10, int network-bandwidth (Mbps) ≥ 100). If an attribute in a range query may take an arbitrary value, the wild card (∗) can be used instead.
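To make the query semantics concrete, the following sketch checks whether a resource name satisfies a range query. The dictionary representation, the `matches` function and the encoding of a range as a closed `(low, high)` tuple are assumptions for illustration; strict inequalities are left out for brevity, and a wild card is assumed to match even when the attribute is absent from the name.

```python
import math


def matches(name, query):
    """Return True if the resource name satisfies every condition of the query."""
    for attr, condition in query.items():
        if condition == "*":                 # wild card: any value is accepted
            continue
        if attr not in name:
            return False
        value = name[attr]
        if isinstance(condition, tuple):     # closed range low <= value <= high
            low, high = condition
            if not (low <= value <= high):
                return False
        elif value != condition:             # exact match on a single value
            return False
    return True


computer = {"OS": "linux", "CPU-name": "Pentium 4", "CPU-clock": 1000,
            "memory": 1024, "harddisk-unusedspace": 20, "network-bandwidth": 1000}
range_query = {"OS": "linux", "CPU-name": "Pentium 4", "CPU-clock": (1000, 1200),
               "memory": (512, math.inf), "harddisk-unusedspace": (10, math.inf),
               "network-bandwidth": (100, math.inf)}
```

Here `matches(computer, range_query)` holds for the example computer and query given above, and it would fail for an otherwise identical computer with a 1300 MHz clock.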


SENS distributes resource information to the overlay network of NSs based on resource IDs. SENS builds the resource ID space as a virtual d-dimensional Cartesian coordinate space (i.e. a d-dimensional resource ID space) using the CAN routing algorithm. To limit the number of NSs responsible for a range query, we propose a locality-preserving mapping scheme between the multi-attribute resource name space and the multi-dimensional resource ID space. A resource ID is the set of d coordinates of a point in the d-dimensional resource ID space. A resource name is mapped to a resource ID by mapping each attribute value of the resource name to a coordinate value of the resource ID in a dimension determined by the attribute. As a result, our mapping scheme maps all resource names that match a range query into a limited segment of the resource ID space (i.e. a range query segment). Furthermore, a data item such as (string audio.input.format = “AVI”, string audio.output.format = “wav”, int network.bandwidth (Mbps) = 10) can be found with a query whose attributes differ from those of the data item in number and order, for example (string audio.input.format = “AVI”, string audio.output.format = “wav”, string video.input.format = *, string video.output.format = *, int network.bandwidth (Mbps) ≥ 5).
Resource information, which includes a resource name and metadata, is stored at the NSs responsible for its resource IDs. A NS performs a query by mapping a queried resource name or a range of queried names to a queried resource ID or a range query segment in the ID space. It then sends a query message to the NSs that are responsible for the queried resource ID or the range query segment.
In the following subsections, we describe our design in detail, including the mapping scheme between resource names and resource IDs, the construction of the resource ID space, the distribution of resource information and query resolution.
3.2 Mapping resource names to resource IDs

Fig. 2. An example of a mapping between a resource name and a resource ID. The resource name (attr1:val1, attr2:val2, attr3:val3, attr4:val4) is mapped to the resource ID (0, v2, v1, 0, v3, v4), where $a_i = H_a(attr_i)$ with a1 = 3, a2 = 2, a3 = 5, a4 = 6, $v_i = H_v(val_i)$, and 0 is the default value. attri:vali denotes the i-th attribute/value pair in the resource name.

A resource name is mapped to a resource ID by assigning the hash value of each attribute value to a coordinate of the corresponding resource ID. The dimension index of the assigned coordinate is the hash value of the corresponding attribute. Here, a uniform hash function $H_a$ hashes each attribute to a value from 1 to $d$, and a hash function $H_v$ hashes each attribute value to a value from 1 to $2^m - 1$, where $m$ is the maximum size of a coordinate value in bits. If no value is assigned to a coordinate, a default value (e.g. 0) is assigned instead.

Fig. 3. Mapping a resource name to multiple resource IDs in the case where attributes attr1 and attr5 are hashed to the same value a1 = a5 = 3 and attributes attr3 and attr6 are hashed to the same value a3 = a6 = 5. The resource name (attr1:val1, ..., attr6:val6), with a2 = 2 and a4 = 6, is mapped to the resource IDs (0, v2, v1, 0, v3, v4), (0, v2, v1, 0, v6, v4), (0, v2, v5, 0, v3, v4) and (0, v2, v5, 0, v6, v4), where $a_i = H_a(attr_i)$, $v_i = H_v(val_i)$ and 0 is the default value.
For example, the resource name shown in Fig. 2 is identified by a tuple of 4 attribute/value pairs: ((attr1 : val1), (attr2 : val2), (attr3 : val3), (attr4 : val4)). The resource name is mapped to a resource ID in a 6-dimensional resource ID space. Because the hash value of attr1 is $a_1 = H_a(attr_1) = 3$, the hash value of val1 is assigned to the 3rd coordinate, and so on. Since no value is assigned to the 1st and 4th coordinates, the default value 0 is assigned to them instead.
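The basic mapping (without attribute collisions) can be sketched as below. The text does not name concrete hash functions, so the SHA-1 based `uniform_hash` is purely an assumption, as are the function names and the optional `dim_of` table, which is used here only to reproduce the dimension assignment a1 = 3, a2 = 2, a3 = 5, a4 = 6 of the Fig. 2 example.

```python
import hashlib


def uniform_hash(text, modulus):
    """Assumed uniform hash: SHA-1 digest reduced modulo `modulus`."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % modulus


def h_a(attr, d):
    """Dimension index in 1..d for an attribute."""
    return 1 + uniform_hash(attr, d)


def h_v(value, m):
    """Coordinate value in 1..2^m - 1 for a string attribute value."""
    return 1 + uniform_hash(value, 2 ** m - 1)


def map_name(pairs, d, m, dim_of=None, default=0):
    """Map (attribute, value) pairs to one resource ID; collisions are not handled here."""
    coords = [default] * d
    for attr, value in pairs:
        dim = dim_of[attr] if dim_of else h_a(attr, d)
        coords[dim - 1] = h_v(value, m)
    return tuple(coords)


pairs = [("attr1", "val1"), ("attr2", "val2"), ("attr3", "val3"), ("attr4", "val4")]
fig2_dims = {"attr1": 3, "attr2": 2, "attr3": 5, "attr4": 6}
resource_id = map_name(pairs, d=6, m=8, dim_of=fig2_dims)
```

In `resource_id`, the 1st and 4th coordinates keep the default value 0, while the 3rd coordinate holds the hash of val1, mirroring the example in the text.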
In the case of a numerical attribute value, a locality-preserving hash function is used as $H_v$. Here, a locality-preserving hash function is one for which $v_i > v_j$ implies $H_v(v_i) > H_v(v_j)$ [12]. An example of a locality-preserving hash function is
$$H_v(v) = \frac{(v - v_{min})(2^m - 1)}{v_{max} - v_{min}},$$
where $v_{max}$ and $v_{min}$ are the maximum and minimum values that the attribute value may take. In the case of an attribute value of string type, a uniform hash function is used as $H_v$.
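The example hash function translates directly into code. The sketch below assumes integer attribute values and rounds down to an integer coordinate, which is not stated in the text; with rounding, the order is preserved only weakly (two nearby values may share a coordinate). The function name is hypothetical.

```python
def locality_hash(v, v_min, v_max, m):
    """Map an integer v in [v_min, v_max] to a coordinate in [0, 2^m - 1], preserving order."""
    if not v_min <= v <= v_max:
        raise ValueError("value outside the attribute's domain")
    return (v - v_min) * (2 ** m - 1) // (v_max - v_min)
```

For instance, with a CPU clock domain of 0 to 4000 MHz (an assumed domain) and m = 10, the values 1000 and 1200 map to coordinates 255 and 306, so the query range [1000, 1200] becomes the coordinate range [255, 306].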
All resource names that match a range query are mapped into a segment of the resource ID space, bounded in each dimension by the hash values of the upper and lower limits of the queried value range. If a resource name matches a range query, each of its attribute values lies between the upper and lower limits of the value range of the corresponding attribute. Since the values of an attribute are mapped to coordinate values in the same dimension, and the mapping between an attribute value and a coordinate value is locality-preserving, each coordinate value of the resource ID lies between the hash values of the upper and lower limits of the queried value range in that dimension.
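A range query segment can be computed per dimension from the hashed limits, as in the following sketch. Several details are assumptions rather than part of the text: the function names, the explicit `dim_of` and `domains` tables, integer rounding of the hash, and the choice to leave dimensions without a queried range spanning the full coordinate range.

```python
def locality_hash(v, v_min, v_max, m):
    """Order-preserving hash of an integer value (rounded down)."""
    return (v - v_min) * (2 ** m - 1) // (v_max - v_min)


def query_segment(ranges, dim_of, domains, d, m):
    """Return one (low, high) coordinate interval per dimension."""
    segment = [(0, 2 ** m - 1)] * d          # assumption: unconstrained dimensions are open
    for attr, (low, high) in ranges.items():
        v_min, v_max = domains[attr]
        segment[dim_of[attr] - 1] = (locality_hash(low, v_min, v_max, m),
                                     locality_hash(high, v_min, v_max, m))
    return segment


def in_segment(resource_id, segment):
    """True if every coordinate of the resource ID lies inside the segment."""
    return all(lo <= c <= hi for c, (lo, hi) in zip(resource_id, segment))
```

A resource whose hashed clock value falls between the hashed limits is inside the segment, while one beyond the upper limit is outside; the NS still checks the full resource name afterwards, since the mapping is not injective.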
Our mapping scheme is not injective. Several resource names with different attribute/value pairs may be mapped to the same resource ID. However, the resource ID does not need to be unique, since resource information is identified by the resource name, not the resource ID. When looking up a queried resource ID, the NS checks the resource name before returning the lookup result.
If multiple attributes in a resource name are hashed to the same value (e.g. $H_a(attr_i) = H_a(attr_j)$), the corresponding attribute values are mapped to multiple coordinate values in the same dimension (i.e. a set of collided coordinate values). In this case, to preserve the locality property, the resource name is mapped to multiple resource IDs, each of which contains one value from each set of collided coordinate values. For example, a resource name is identified by a tuple of six attribute/value pairs as shown in Fig. 3, its attributes attr1 and attr5 are hashed to the same value $H_a(attr_1) = H_a(attr_5) = 3$, and its attributes attr3 and attr6 are hashed to the same value $H_a(attr_3) = H_a(attr_6) = 5$. In this case, since $\{v_1, v_5\}$ and $\{v_3, v_6\}$ are two sets of collided coordinate values, the resource name is mapped to four resource IDs, whose coordinates in the 3rd and 5th dimensions are $(v_1, v_3)$, $(v_1, v_6)$, $(v_5, v_3)$ and $(v_5, v_6)$, each taking one value from each set. Resource information is replicated and delivered to the NSs that are responsible for these resource IDs.
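The set of resource IDs produced under collisions is the Cartesian product of the per-dimension value sets. The sketch below reproduces the Fig. 3 example; the function name and the use of symbolic strings such as "v1" in place of hashed values are assumptions for readability.

```python
from itertools import product


def resource_ids(pairs, dim_of, d, default=0):
    """Enumerate all resource IDs for (attribute, coordinate value) pairs."""
    per_dim = [[] for _ in range(d)]
    for attr, coord in pairs:
        per_dim[dim_of[attr] - 1].append(coord)     # collided values share a dimension
    choices = [values if values else [default] for values in per_dim]
    return list(product(*choices))                  # one value per dimension in each ID


pairs = [("attr1", "v1"), ("attr2", "v2"), ("attr3", "v3"),
         ("attr4", "v4"), ("attr5", "v5"), ("attr6", "v6")]
dims = {"attr1": 3, "attr2": 2, "attr3": 5, "attr4": 6, "attr5": 3, "attr6": 5}
ids = resource_ids(pairs, dims, d=6)
```

`ids` contains four resource IDs, starting with (0, "v2", "v1", 0, "v3", "v4"). The count is the product of the sizes of the collided sets, which is why the number of IDs grows quickly as more attributes collide.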
A large number of collided coordinate values forces a large number of resource IDs to be generated. The probability that a collision occurs depends on the number of attribute/value pairs in a resource name. To limit the number of resource IDs per resource name, the number of attribute/value pairs in a resource name should be limited to a reasonable value. If the number of attribute/value pairs of a resource exceeds this limit, the set of attribute/value pairs should be divided into multiple sets, each corresponding to a separate resource name. How to divide resource names is outside the scope of this paper.
3.3 Load balancing


The number of resource IDs distributed to the zone that a NS is responsible for depends not only on the volume of the zone but also on the number of coordinates of the zone that contain the default value. This is because, in a mapping between a resource name and a resource ID, the default value is assigned to every coordinate that is not mapped from the hash value of an attribute value. To keep the gap between the numbers of resource IDs stored at different NSs small, for each zone assigned to a NS, the number of coordinates containing the default value and the volume of the zone should be in inverse proportion. We realize this requirement simply by randomly assigning the default value to a number of coordinates of each initial point $P_i$, which is assigned to a NS when it joins the overlay network. As a result, a zone whose coordinates contain a large number of default values is split with high probability, and therefore the volume of such a zone is small.
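Choosing an initial point with a fixed number of default coordinates can be sketched as follows. The function name, the uniform choice of the remaining coordinates in 1..2^m − 1 and the explicit random generator argument are assumptions for illustration.

```python
import random


def initial_point(d, m, n_default, rng, default=0):
    """Random join point with exactly n_default coordinates set to the default value."""
    point = [rng.randint(1, 2 ** m - 1) for _ in range(d)]   # non-default coordinates
    for index in rng.sample(range(d), n_default):            # distinct random dimensions
        point[index] = default
    return tuple(point)


p = initial_point(d=20, m=10, n_default=12, rng=random.Random(0))
```

Because joining NSs land where many coordinates equal the default value, the zones around such points are split more often and end up smaller, which is the effect the paragraph above describes.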
3.4 Resource information distribution

A resource ID is mapped to a point P of the resource ID space and information of
the resource is delivered to the NS that owns the zone within which the point P lies.
If a resource name is mapped to multiple resource IDs, information of the resource
will be replicated at NSs responsible for these resource IDs. If a NS is responsible
for several resource IDs of the same resource, only one copy is replicated at the
NS.
Fig. 4. Multicast routing protocol for sending information of a resource to NSs responsible for resource IDs. The figure shows message forwarding from the agent NS through a middle NS to the responsible NSs; each NS is labeled with the zone it is responsible for, and each message with its destination ID, its multicast dimension and the resource IDs.

To deliver information of a resource to the multiple NSs responsible for its resource IDs, we propose a multicast routing algorithm based on a spanning binomial tree (SBT) [16]. In our algorithm, only one registration message is sent to each NS responsible for resource IDs of a resource name. Thus, our multicast routing algorithm sends only the minimum number of messages needed to deliver information of a resource to the corresponding NSs. The algorithm is performed as follows.
The resource IDs corresponding to a resource name form a hypercube in the resource ID space. The SBT is constructed on this hypercube (Fig. 4a) and the message is routed from the root node of the SBT to the nodes at lower levels. Suppose that the resource IDs corresponding to a resource name are expressed as $(\{v_{11}, ..., v_{1n_1}\}, ..., \{v_{i1}, ..., v_{in_i}\}, ..., \{v_{d1}, ..., v_{dn_d}\})$, where $\{v_{i1}, ..., v_{in_i}\}$ is the set of coordinate values in the $i$-th dimension ($i \in [1, d]$). The registration message containing information of the resource is first delivered to the NS responsible for the resource ID formed from the lowest value of each set (i.e. $(v_{11}, ..., v_{i1}, ..., v_{d1})$). This NS becomes the root node of the spanning binomial tree. The NS creates new destination resource IDs, which correspond to lower-level nodes of the SBT, and then forwards the message to these destination resource IDs. The NSs receiving the message create further destination resource IDs based on the SBT and relay the message recursively (Fig. 4b).
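The recursive forwarding can be illustrated at the level of resource IDs. The sketch below is written in the spirit of an SBT and is not necessarily the authors' exact rule: a node that received the message along dimension k forwards it only along dimensions after k, which guarantees that every resource ID receives the message exactly once. It ignores the step where several resource IDs fall into the zone of one NS. The function name and the example value sets are hypothetical.

```python
def sbt_multicast(value_lists):
    """Return (root ID, list of (sender ID, receiver ID) edges) of the forwarding tree.

    value_lists[i] is the sorted list of coordinate values in dimension i.
    """
    root = tuple(values[0] for values in value_lists)   # lowest value in each dimension
    edges = []

    def forward(node, start_dim):
        for dim in range(start_dim, len(value_lists)):
            for value in value_lists[dim][1:]:          # every non-lowest value
                child = node[:dim] + (value,) + node[dim + 1:]
                edges.append((node, child))
                forward(child, dim + 1)                 # child only uses later dimensions

    forward(root, 0)
    return root, edges


lists = [[0, 1, 3], [0, 3], [0, 3, 4]]
root, edges = sbt_multicast(lists)
```

For the three value sets above there are 3 x 2 x 3 = 18 resource IDs, and the tree contains 17 edges, one message per resource ID other than the root.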
3.5 Query resolution


In the case of an exact query, the agent NS (i.e. the NS to which the querying host sends the query message) maps the queried resource name to its resource IDs and selects the nearest one as the destination resource ID. The query message, which includes the queried resource name, is sent to the destination resource ID using the CAN routing algorithm. The NS responsible for the resource ID looks up its database to find the queried information and sends the information back to the agent NS.
In the case of a range query, the agent NS maps the query range to a range query segment in the resource ID space. A query message is broadcast to all NSs whose zones overlap the segment. NSs receiving the query message check their databases for information of resources whose names match the query.
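Which NSs must receive a range query follows from a per-dimension overlap test between zones and the query segment. The sketch below assumes half-open zones [lo, hi) and a closed segment [lo, hi]; the function names and the regular grid of zones of side 2 are hypothetical choices made for the example.

```python
def overlaps(zone, segment):
    """True if a half-open zone intersects a closed query segment in every dimension."""
    return all(z_lo <= s_hi and s_lo < z_hi
               for (z_lo, z_hi), (s_lo, s_hi) in zip(zone, segment))


def responsible_zones(zones, segment):
    """Names of all zones that overlap the range query segment."""
    return [name for name, zone in zones.items() if overlaps(zone, segment)]


# Example layout: zones of side 2 covering [0,6) x [0,4) x [0,4).
zones = {(x, y, z): [(x, x + 2), (y, y + 2), (z, z + 2)]
         for x in (0, 2, 4) for y in (0, 2) for z in (0, 2)}
hit = responsible_zones(zones, [(1, 5), (1, 3), (1, 3)])
```

In this layout the segment ([1,5],[1,3],[1,3]) touches all 12 zones, whereas a small segment such as ([0,1],[0,1],[0,1]) touches only the zone at the origin.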

Fig. 5. Broadcast routing algorithm for sending a query message to the range query segment based on a spanning binomial tree: a) the spanning polynomial tree created from the range query segment ([1,5],[1,3],[1,3]), with each edge labeled by its broadcast dimension d and broadcast direction dr; b) broadcast routing from the agent NS to the NSs responsible for the range query segment ([1,5],[1,3],[1,3]).

To reduce the cost of broadcasting, we propose a broadcast algorithm based on the SBT that delivers a query message to all NSs in a range query segment with a minimum number of messages. The number of messages sent is approximately the number of NSs in the range query segment. The SBT is constructed on the NSs in the query segment (Fig. 5a). The query message is first sent to a NS in the range query segment by the CAN routing algorithm. That NS then forwards the query message to its neighbor NSs that correspond to lower-level nodes in the SBT. These NSs in turn recursively forward the query message to their neighbor NSs that correspond to lower-level nodes of the SBT (Fig. 5b).

4 Evaluation


We evaluate the performance of SENS by simulation from the following aspects.
– Routing performance: the logical hop count required to route a query message to a NS responsible for the query
– System efficiency: the number of replicas of the resource information corresponding to a resource name, and the number of NSs responsible for a query
– Degree of load balancing: the number of resource information items stored at each NS
We implemented a simple simulator to evaluate our system. We assume that the number of attribute/value pairs in a resource name varies from 10 to 20. We set the number of dimensions of the resource ID space to 20. The default value is randomly assigned to 12 coordinates of the initial point assigned to each NS when it joins the overlay network.


The resource names are generated from a Zipf dataset, which reflects the popularity of an attribute/value pair through a parameter called its rank. The probability that an attribute/value pair appears in a resource name is proportional to $1/r^{\alpha}$. Here, $r$ is the rank of the attribute/value pair and $\alpha$ is a constant. We set $\alpha$ to 0.9 and $r$ to a random number between 1 and the total number of attribute/value pairs. Our dataset has 400 attributes, each of which can take 1024 values. Exact queries and range queries are generated from the same Zipf dataset.
The following subsections show our simulation results.
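A Zipf-weighted draw of attribute/value pairs can be sketched with the standard library alone. This is not the authors' simulator: the function names are hypothetical, ranks are used directly as pair identifiers, and draws are made with replacement.

```python
import random


def zipf_weights(n, alpha):
    """Unnormalized Zipf weights 1 / r^alpha for ranks r = 1..n."""
    return [1 / r ** alpha for r in range(1, n + 1)]


def sample_ranks(n, alpha, k, rng):
    """Draw k ranks (with replacement) with probability proportional to 1 / r^alpha."""
    return rng.choices(range(1, n + 1), weights=zipf_weights(n, alpha), k=k)


weights = zipf_weights(1000, 0.9)
ranks = sample_ranks(1000, 0.9, k=15, rng=random.Random(42))
```

With α = 0.9 the pair of rank 1 is about 1.87 times as likely as the pair of rank 2 (the ratio is 2^0.9), which produces the skew that the load balancing experiment is designed to stress.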
4.1 Routing performance

Fig. 6. Routing performance of SENS. a) Path length and average logical hop count in the case of exact queries (x-axis: NS number, from 1000 to 100000; y-axis: hop count; series: average path length, average routing hops for an exact query). b) Average logical hop count in the case of range queries (x-axis: number of value ranges per query; y-axis: hop count; series: size of a value range 20%, 30%, 40%).

We first study how the average path length (i.e. the average logical hop count required to route a message between two NSs) in the overlay network grows with the number of NSs. 100,000 messages are sent to random destination resource IDs from a randomly selected NS. The simulation result (Fig. 6 a)) shows that the average path length grows logarithmically with the number of NSs. In a 100,000-node system, the average path length is about 6.9 hops.
Fig. 6 a) also shows the logical hop count required to route an exact query message to the NS responsible for the queried resource information. The average logical hop count required to route an exact query message to the corresponding NS (5.0 hops in a 20,000-node system) is smaller than the average path length (5.9 hops in a 20,000-node system). This is because information of a resource may be replicated at a number of NSs and the nearest resource ID is always selected as the destination resource ID for delivering query messages.
We study the logical hop count required for a range query in a 20,000-node SENS system by fixing the size of each value range at 20%, 30% or 40% of the maximum coordinate value and increasing the number of value ranges to be queried. As shown in Fig. 6 b), the average logical hop count required to route a range query message to the responsible NSs increases by only 1.3 hops when the number of queried value ranges increases from 1 to 16. This means that our broadcast routing protocol achieves good routing performance.


4.2 System efficiency

Fig. 7. Evaluation result of system efficiency. a) Average resource ID number and replication number corresponding to a resource name (x-axis: attribute/value pair number per resource name, from 10 to 20; series: average number of resource IDs per name, average number of replications per name). b) Number of NSs responsible for a range query in the case of range queries (x-axis: number of value ranges per query; y-axis: number of NSs; series: size of a value range 20%, 30%, 40%).


We study the average number of resource IDs and the average number of replications per resource name in a 20,000-node SENS system. As shown in Fig. 7 a), the number of resource IDs per resource name is large and increases exponentially with the number of attribute/value pairs in a resource name. However, because a NS may be responsible for several resource IDs of the same resource, the number of replications per resource name is relatively small. For a resource name with 20 attribute/value pairs, the number of resource IDs per resource name is 94.7 while the number of replications per resource name is 7.8 on average.
The average number of NSs to be queried in a range query increases exponentially with the number of value ranges per query (Fig. 7 b)). This is because the volume of the query segment increases exponentially. However, we consider the number of NSs to be queried small enough to be viable. When the number of value ranges is 10 and each value range is 20% of the maximum value, the average number of queried NSs is 2.2, while when the number of value ranges is 12 and each value range is 30% of the maximum value, the average number of queried NSs is 7.1.

4.3 Load balancing

To evaluate the degree of load balancing in the SENS system, we measured the number of resource names stored at each NS by delivering 1,000,000 resource names to a 20,000-node SENS system. The number of attribute/value pairs in a resource name varies from 10 to 20.
Fig. 8 a) shows the attribute/value pair appearance probability in resource names. Fig. 8 b) shows the ratio of the number of resource names stored at each NS to the total number of names. Our simulation shows that even though several attribute/value pairs appear in resource names with high probability (about 13.96% of the total number of resource names), the maximum number of resource names stored at a NS does not exceed 0.23% of the total number of resource names (Fig. 8 b)).


Fig. 8. a) Attribute/value pair appearance probability in resource names (x-axis: attribute/value pair ID; y-axis: percentage of total resource names). b) Number of resource information stored in each NS (x-axis: NS ID; y-axes: percentage of total resource names and percentage of total queries).

5 Applications of SENS

By using attribute/value pairs, our naming scheme can sufficiently express the properties of resources and the range queries for retrieving information of resources. Descriptive naming schemes have been used in a number of previous studies [8, 13, 14]. We also consider that our naming system can provide an information retrieval service for existing naming schemes such as the Resource Description Framework (RDF) [5] and Directory services [6].
In RDF, since the URI is not always known in advance, we must search for information of resources, including the URI, based on the properties of the resources. Our system can rename such a resource with attribute/value pairs corresponding to the property types and property values of the resource. The RDF name of a resource is regarded as the full name of the resource and is kept in the database of the name servers as resource information for further searches (i.e. searches for resources whose full name satisfies given constraints).
Our naming scheme can also express a resource that is named in a Directory service. The hierarchical tree structure of Directory services can clearly express the relationship between attribute/value pairs, which is necessary when the attribute names of different attribute/value pairs are very similar. Our naming scheme can express the relationship between attribute/value pairs by using attribute hierarchies. For example, the audio and video formats of an audio/video converter can be expressed as (string audio.input.format = “AVI”, string audio.output.format = “wav”, string video.input.format = “mpeg2”, string video.output.format = “mov”).

6 Conclusions


We have proposed the SENS naming system, which retrieves information of resources by exact queries and multi-attribute range queries over resource names. In our system, resource information distribution and querying are performed based on a resource ID space constructed on the overlay network of NSs. Our mapping scheme between resource names and resource IDs allows SENS to realize exact queries and multi-attribute range queries effectively while still achieving good load balancing even when the distribution of attribute/value pairs in resource names is skewed. We also propose a multicast routing algorithm to deliver resource information to the corresponding NSs and a broadcast routing algorithm to send query messages to all NSs in the query segment with minimum cost. Our simulation results validated the system’s routing performance and load balancing properties.
Our future work is to implement the SENS system in a test-bed system and
develop applications for the SENS system.

Acknowledgment
This work is partially supported by the National Fundamental IT research project “Modern Methods for Building Intelligent Systems”.

References
1. M. Satyanarayanan, “Pervasive Computing: Vision and Challenges”, IEEE Personal Communications, vol. 8, no. 4, pp. 10-17, Aug. 2001
2. I. Foster, C. Kesselman, J. Nick, S. Tuecke, “Grid Services for Distributed System Integration”, IEEE Computer, vol. 35, iss. 6, June 2002
3.
4. M. Wahl, T. Howes and S. Kille, “Lightweight Directory Access Protocol (v3)”, RFC 2251, Dec. 1997
5. G. Klyne and J. Carroll, “Resource Description Framework (RDF): Concepts and Abstract Syntax”, />TR/rdf-concepts/, W3C Recommendation, Feb. 2004
6. W. Yeong, T. Howes and S. Kille, “X.500 Lightweight Directory Access Protocol”, RFC 1487, July 1993
7. W. Adjie-Winoto, E. Schwartz, H. Balakrishnan and J. Lilley, “The Design and Implementation of an Intentional Naming System”, In Proceedings of ACM Symposium on Operating Systems Principles, Dec. 1999
8. A. Carzaniga, D. Rosenblum and A. Wolf, “Design and Evaluation of a Wide-Area Event Notification Service”, ACM Transactions on Computer Systems, vol. 19, no. 3, August 2001
9. S. Ratnasamy, P. Francis, M. Handley and R. Karp, “A Scalable Content-Addressable Network”, In Proceedings of ACM SIGCOMM’01, Aug. 2001
10. I. Stoica, R. Morris, D. Karger, M.F. Kaashoek, H. Balakrishnan, “Chord: A Scalable peer-to-peer lookup service for Internet applications”, In Proceedings of ACM SIGCOMM’01, August 2001
11. A. Rowstron and P. Druschel, “Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems”, In Proceedings of IFIP/ACM International Conference on Distributed Systems Platforms, Nov. 2001
12. M. Cai, M. Frank, J. Chen, P. Szekely, “MAAN: A Multi-attribute Addressable Network for Grid Information Services”, In Proceedings of Fourth International Workshop on Grid Computing, Nov. 2003
13. D. Oppenheimer, J. Albrecht, D. Patterson, and A. Vahdat, “Scalable Wide-Area Resource Discovery”, UC Berkeley Technical Report UCB//CSD-04-1334, 2004
14. A. Bharambe, M. Agrawal and S. Seshan, “Mercury: Supporting Scalable Multi-Attribute Range Queries”, In Proceedings of ACM SIGCOMM’04, 2004
15. C. Tang, Z. Xu and S. Dwarkadas, “Peer-to-Peer Information Retrieval Using Self-Organizing Semantic Overlay Networks”, In Proceedings of ACM SIGCOMM’03, August 2003
16. S. Johnsson and C. Ho, “Optimum Broadcasting and Personalized Communication in the Hypercube”, IEEE Transactions on Computers, vol. 38, no. 9, pp. 1249-1268, Sep. 1989


