
• Information: A scheduling instance must have coherent access to static
and dynamic information about resources' characteristics (computational,
data, networks, etc.), resource usage records, job characteristics, and, in
general, services involved in the scheduling process. Moreover, it must
be able to publish and update its own static and dynamic attributes to
make them available to other scheduling instances. These attributes in-
clude allocation properties, local scheduling strategies, negotiation mech-
anisms, local agreement templates and resource information relevant to
the scheduling process [5]. It can be, in addition, useful to provide the
capability to cache historical information.
• Search: This function can be exploited to perform optimised information gathering on resources. For example, in large-scale Grids it is neither necessary nor efficient to collect information about every resource, but just about a subset of "good" candidate resources. Several search strategies can be implemented (e.g. "best fit" searches, P2P searches with caching, iterative searches, etc.). Every search should include at least two parameters: the number of records requested in the reply and a time-out for the search procedure (see the sketch after this list).
• Monitoring: A scheduling infrastructure can monitor different attributes to perform its functions: for instance the status of an SLA to check that it is not violated, the execution of a job to undertake scheduling or corrective actions, or the status of a scheduling description throughout its lifetime for user feedback.

• Forecasting: In order to calculate a schedule it can be useful to rely on forecasting services to predict the values of the quantities needed to apply a scheduling strategy. These forecasts can be based on historical records, current and/or planned values.
• Performance Evaluation: The description of a job to be scheduled may lack some information needed by the system to apply a scheduling strategy. In this case it can be useful to apply performance evaluation methodologies based on the available job description in order to predict the missing information.
• Reservation: To schedule complex jobs such as workflows and co-allocated tasks, as well as jobs with QoS guarantees, it is in general necessary to reserve resources for particular time frames. The reservation of a resource can be obtained in several ways: automatically (because the local resource manager enforces it), on demand (only if explicitly requested by the user), etc. Moreover, the reservations can be restricted in time: for example, only short-term reservations (i.e. with a finite time horizon) may be available. This function can require interaction with local resource managers, and can be in charge of keeping information about allotted reservations and of reserving new time frames on the resource(s).
• Co-allocation: This function is in charge of the mechanisms needed
to solve co-allocation scheduling problems, in which strict constraints
on the time frames of several reservations must be respected (e.g. the
execution at the same time of two highly interacting tasks). It can rely
on a low-level clock synchronisation mechanism.
• Planning: When dealing with complex jobs (e.g. workflows) that need
time-dependent access to and coordination of several objects like ex-
ecutables, data, or network paths, a planning functionality, potentially
built on top of a reservation service, may provide the necessary service.
• Negotiation: To reach an agreement on a particular QoS, the interacting partners may need to follow particular rules to exchange partial agreements in order to reach a final decision (e.g. who is in charge of providing the initial SLA template, who may modify what, etc.). This function should include a generic mechanism to implement several negotiation rules.
• Execution: An execution entity is responsible for actually executing the scheduled jobs. It must interact with the local resource manager to perform the actions needed to run all the components of a job (e.g. staging, activation, execution, clean-up). Usually it interacts with a monitoring system to control the status of the execution.
• Banking: The accounting/billing functionalities are performed by a banking system. It must provide interfaces to access accounting information, to charge for reservations or resource usage, and to refund, e.g. in case of SLA failure or violation.
• Translation: The interaction with several services that may be implemented differently can make it necessary to "translate" information about the scheduling problem, mapping the semantics of one system to the semantics of another.
• Data Management Access: Data transfers can be included in the description of jobs. Although data management scheduling shows several similarities with job scheduling, it is considered a distinct, stand-alone functionality, because it also shows significant differences (e.g. replica management and repository information) [9]. The implementation of a scheduling system may need access to data management facilities to program data transfers with respect to planned job allocations, data availability and eligible costs. This functionality can rely on previously mentioned ones, like information management, search, agreement and negotiation.
• Network Management Access: Data transfers as well as job interactions may need particular network resources to achieve a certain QoS level during their execution. As in the case of data management access, due to its nature and complexity, network management is considered a stand-alone functionality that should be exploited by scheduling systems if needed [10]. This functionality can rely on previously mentioned ones, like information management, search, agreement and negotiation.
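As an illustration of the two minimum search parameters mentioned above, the following sketch shows what a search call of a scheduling instance could look like. It is purely hypothetical: the GSA documents do not prescribe a concrete API, and all names and types below are invented for this example.

#include <stddef.h>
#include <string.h>

/* Hypothetical record describing a candidate resource. */
typedef struct {
    char   resource_id[64];
    double score;                        /* "goodness" of the candidate       */
} resource_record;

/*
 * Hypothetical search call: every search carries at least the number of
 * records requested in the reply and a time-out for the search procedure.
 * Returns the number of records found (0 if the time-out expired first).
 */
size_t gsa_search(const char *query,
                  resource_record *reply,
                  size_t max_records,      /* records requested in the reply  */
                  unsigned timeout_ms)     /* time-out for the search         */
{
    /* A real implementation would query an information service here;
       this stub just returns a single dummy candidate.                       */
    if (max_records == 0 || timeout_ms == 0)
        return 0;
    (void)query;
    strcpy(reply[0].resource_id, "example-cluster-01");
    reply[0].score = 1.0;
    return 1;
}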
4. Scheduling Instance
It is possible to consider the different blocks of the examples in Section 2 as particular implementations of a more general software entity called scheduling instance. In this context, a scheduling instance is defined as a software entity that exhibits a standardised behaviour with respect to the interactions with other software entities (which may be part of a GSA implementation or external services). Such scheduling entities cooperate to provide, if possible, a solution to scheduling problems submitted by users, e.g. the selection, planning and reservation of resource allocations for a job [5].
The scheduling instance is the basic building block of a scalable, modular
architecture for scheduling tasks, jobs, workflows, or applications in Grids. Its
main function is to find a solution to a scheduling problem that it receives via
a generic input interface. To do so, the scheduling instance needs to interact

with local resource management systems that typically control the access to the
resources. If a scheduling instance can find a solution for a submitted scheduling problem, the generated schedule is returned via a generic output interface.
From the examples depicted above it is possible to derive a high-level model
of operations that a scheduling instance can exploit to provide a solution to a
scheduling problem:
• The scheduling instance can try to solve the whole problem by itself, interacting with the local resource managers it has access to.
• It can partition the problem into several scheduling sub-problems. With respect to the different sub-problems it can
- try to solve some of the sub-problems,
- negotiate with other scheduling instances to transfer unsolved sub-problems to them,
- wait for potential solutions coming from other scheduling instances, or
- aggregate localised solutions to find a global solution for the original problem.
• If the partition of the problem is impossible, or no solution can be found by aggregating sub-problem solutions, the scheduling instance can perform one of the following actions:
- It can report back to the entity that submitted the scheduling problem that it cannot find a solution, or
- it can
* negotiate with other scheduling instances to forward the whole problem, or
* wait for a solution to be delivered by the scheduling instance the problem has been forwarded to.
A generic Grid Scheduling Architecture will need to provide these operations, but actual implementations do not need to implement all of them. As this model of operations is modular, it permits the implementation of several different scheduling infrastructures, like the ones depicted in the Grid scheduling scenarios.
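The following sketch restates this operation model as code. It is only an illustration: the types and the helper functions (solve_locally, partition, forward_to_peer, aggregate) are hypothetical place-holders for whatever a concrete scheduling instance implements.

#include <stddef.h>
#include <stdlib.h>

typedef struct problem  problem;      /* a (sub-)scheduling problem          */
typedef struct schedule schedule;     /* a scheduling decision               */

/* Hypothetical hooks provided by a concrete scheduling instance. */
schedule *solve_locally(problem *p);               /* via local RMS          */
problem **partition(problem *p, size_t *n);        /* NULL if impossible     */
schedule *forward_to_peer(problem *p);             /* negotiate and wait     */
schedule *aggregate(schedule **parts, size_t n);   /* merge sub-solutions    */

schedule *handle(problem *p)
{
    schedule *s;

    /* Try to solve the whole problem using the local resource managers. */
    if ((s = solve_locally(p)) != NULL)
        return s;

    /* Otherwise try to partition it into sub-problems. */
    size_t n = 0;
    problem **subs = partition(p, &n);
    if (subs != NULL) {
        schedule **parts = calloc(n, sizeof *parts);
        for (size_t i = 0; i < n; i++) {
            parts[i] = solve_locally(subs[i]);        /* solve some locally  */
            if (parts[i] == NULL)
                parts[i] = forward_to_peer(subs[i]);  /* delegate the rest   */
        }
        s = aggregate(parts, n);                      /* global solution     */
        free(parts);
        if (s != NULL)
            return s;
    }

    /* Partitioning impossible or aggregation failed: forward or give up. */
    if ((s = forward_to_peer(p)) != NULL)
        return s;
    return NULL;   /* report back that no solution could be found */
}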
Apart from the operations a generic architecture should support, we can infer from the scenarios that a generic scheduling instance should be able to:
• interact with local resource managers;
• interact with external services that are not defined in the Grid Schedul-
ing Architecture, like information, forecasting, submission, security or
execution services;
• receive a scheduling problem (from other scheduling instances or exter-
nal submission services), calculate a schedule, and return a scheduling
decision;
• split a problem into sub-problems, receive scheduling decisions, and merge them into a new one;
• forward problems to other scheduling instances.

However, an instance might exhibit only a subset of such abilities, which
depends on its modus operandi and the objectives of its provider. If a scheduling
instance is able to cooperate with other instances, it must exhibit the ability to
send problems or sub-problems, and receive scheduling results. Looking at
such an instance in relation to others, we call higher-level scheduling instances
the ones that are able to directly forward a problem to that instance, and lower-
level scheduling instances the ones that are able to directly accept a scheduling
problem from that instance. A single instance must act as a decoupling entity between the actions performed at higher and lower levels: it is neither concerned with the instances which previously dealt with the problem (i.e. whether it has been submitted by an external service or forwarded by other instances, as a whole problem or as a sub-problem), nor with the actions that the following instances will undertake to solve the problem. Every instance needs to know solely the problem it has to solve and the source of the original scheduling problem, to avoid or resolve potential forwarding issues.

Figure 4. Functional interfaces of a scheduling instance
From a component point of view the abilities described above are expressed
as interfaces. In general, the interfaces of a scheduling instance can be divided
into two main categories: functional interfaces and non-functional interfaces.
The former are necessary to enable the main behaviours of the scheduling
instance, while the latter are exploited to manage the instance itself (creation,
destruction, status notification, etc.).
In this paper we take only the functional interfaces into account. These are essential for a scheduling instance to support the creation of a Grid Scheduling Architecture. Security services, for instance, are not strictly needed from a functional point of view to schedule a job; therefore they are considered as external services or non-functional interfaces.
Figure 4 depicts the following functional interfaces that a scheduling instance can expose:
Input Scheduling Problems Interface The methods of this interface are responsible for receiving a description of a scheduling problem that must be solved, and for starting the scheduling process. This interface is not intended to accept jobs directly from users; rather, an external submission service (e.g. a portal or a command line interface) can collect the scheduling problems, validate them and produce a neutral representation accepted as input by this interface. In this way, this interface is fully decoupled from external interactions and can be exploited to compose several scheduling instances, where an instance can forward a problem or submit a sub-problem to other instances using this interface.
Every scheduling instance must implement this interface.
Output Scheduling Decisions Interface The methods of this interface are responsible for communicating the results of the scheduling process started earlier with a scheduling problem submission. Like the previous one, this interface is not intended to communicate the results directly to a user, but rather to a visualisation or reporting service. Again, we can exploit this decoupling in a modular way: if an instance receives a submission from another one, it must use this interface to communicate the results to the submitting instance.
Every scheduling instance must implement this interface.
Output Scheduling Problems Interface If an instance is able to forward a
whole problem or partial sub-problems to other scheduling instances, it
needs the methods of this interface to submit the problem to lower level
instances.
Input Scheduling Decisions Interface If an instance is able to submit problems to other instances, it must wait until a scheduling decision is produced by the one to which the problem was submitted. The methods of this interface are responsible for the communication of the scheduling results from lower-level instances.
Local Resource Managers Interface The final goal of a scheduling process is to find an allocation of the jobs to the resources. This implies that sooner or later during the process it is necessary for a scheduling instance to interact with local resource managers. While some scheduling instances can be dedicated to the "routing" of the problems, others interact directly with local resource managers to find suitable schedules, and propagate the answers in a neutral representation back to the entity that submitted the scheduling problem. Different local resource managers can require different interaction interfaces.
External Services Interaction Interfaces If an instance must interact with an entity that is neither a local resource manager nor another scheduling instance, it needs an interface that permits communication with that external service. For example, some instances may need to gain access to information, billing, security and/or performance prediction services. Different external services can require different interaction interfaces.
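To make this decomposition more concrete, the sketch below models the functional interfaces as a set of function pointers. This is only an assumption about how such interfaces could be expressed in code; the paper itself does not define a concrete binding, and all names are invented here.

typedef struct problem  problem;
typedef struct schedule schedule;

/* Hypothetical functional interfaces of a scheduling instance (cf. Figure 4). */
typedef struct scheduling_instance scheduling_instance;
struct scheduling_instance {
    /* Mandatory: receive a neutral problem description and start scheduling. */
    int  (*input_problem)(scheduling_instance *self, const problem *p);

    /* Mandatory: report the resulting decision to the submitting entity.     */
    void (*output_decision)(scheduling_instance *self, const schedule *s);

    /* Optional: forward (sub-)problems to lower-level instances ...          */
    int  (*output_problem)(scheduling_instance *self, const problem *sub);
    /* ... and receive their decisions back.                                  */
    void (*input_decision)(scheduling_instance *self, const schedule *s);

    /* Optional: handles for local resource managers and external services.   */
    void *lrm_handle;
    void *external_services;
};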
5. Conclusion
In this paper we discuss a general model for Grid scheduling. This model
is based on a basic, modular component we call scheduling instance. Sev-
eral scheduling instance implementations can be composed to build existing
scheduling scenarios as well as new ones. The proposed model has no claim
to be the most general one, but the authors consider this definition a good
starting point to build a general Grid Scheduling Architecture that supports
cooperation between different scheduling entities for arbitrary Grid resources.
Future work aims at the specification of the interaction of a Grid scheduling instance with other scheduling instances as well as with other middleware services. This work will be carried out by GGF's Grid Scheduling Architecture Research Group [11] and the Virtual Institute on Resource Management and Scheduling [12] within the CoreGRID project. The outcome of this activity should yield a common Grid scheduling architecture that allows the integration of several different scheduling instances that can interact with each other as well as be exchanged with domain-specific implementations.
References
[1] R. Yahyapour and Ph. Wieder (eds.). Grid Scheduling Use Cases. Grid Forum Document, GFD.64, Global Grid Forum, March 26, 2006.
[2] Global Grid Forum. Web site. 1 July 2006.
[3] I. Foster, C. Kesselman, and S. Tuecke. The Anatomy of the Grid - Enabling Scalable Virtual Organizations. In Grid Computing - Making the Global Infrastructure a Reality, F. Berman, G. C. Fox, and A. J. G. Hey (eds.), pp. 171-197. John Wiley & Sons Ltd., 2003.
[4] J. M. Schopf. Ten Actions When Grid Scheduling - The User as a Grid Scheduler. In Grid Resource Management - State of the Art and Future Trends, J. Nabrzyski, J. Schopf, and J. Weglarz (eds.), pp. 15-23. Kluwer Academic Publishers, 2004.
[5] U. Schwiegelshohn and R. Yahyapour. Attributes for Communication between Scheduling Instances. Grid Forum Document, GFD.6, Global Grid Forum, December 2001.
[6] V. Sander (ed.). Networking Issues for Grid Infrastructure. Grid Forum Document, GFD.37, Global Grid Forum, November 22, 2004.
[7] U. Schwiegelshohn, R. Yahyapour, and Ph. Wieder. Resource Management for Future Generation Grids. In Future Generation Grids, Proceedings of the Workshop on Future Generation Grids, V. Getov, D. Laforenza, and A. Reinefeld (eds.), pp. 99-112. Springer, 2004. ISBN: 0-387-27935-0.
[8] J. Bouman, J. Trienekens, and M. van der Zwan. Specification of Service Level Agreements, Clarifying Concepts on the Basis of Practical Research. In Proc. of Software Technology and Engineering Practice 1999 (STEP '99), pp. 169-178, 1999.
[9] R. W. Moore. Operations for Access, Management, and Transport at Remote Sites. Grid Forum Document, GFD.46, Global Grid Forum, May 4, 2005.
[10] D. Simeonidou and R. Nejabati (eds.). Optical Network Infrastructure for Grid. Grid Forum Document, GFD.36, Global Grid Forum, August 2004.
[11] Grid Scheduling Architecture Research Group (GSA-RG). Web site. 1 July 2006.
[12] CoreGRID Virtual Institute on Resource Management and Scheduling. Web site. 1 July 2006.
GRID SUPERSCALAR ENABLED P-GRADE PORTAL

Robert Lovas, Gergely Sipos and Peter Kacsuk
Computer and Automation Research Institute, Hungarian Academy of Sciences (MTA-SZTAKI)



Raül Sirvent, Josep M. Perez and Rosa M. Badia
Barcelona Supercomputing Center and UPC, Spain



Abstract One of the current challenges of the Grid scientific community is to provide efficient and user-friendly programming tools. GRID superscalar allows programmers to write their Grid applications as sequential programs; at execution time, however, a task-dependence graph is built, and the inherent concurrency of the tasks is exploited by executing them in a Grid. P-GRADE Portal is a workflow-oriented grid portal whose main goal is to cover the whole lifecycle of workflow-oriented computational grid applications. In this paper the authors discuss the different options taken into account to integrate these two frameworks.
Keywords: Grid computing, Grid programming models, Grid workflows, Grid portals
1. Introduction

One of the issues that raises current interest in the Grid community, and in the scientific community in general, is application programming in Grids. While more and more scientific groups aim to use the power of Grids, the difficulty of porting applications to the Grid (what is sometimes called application "gridification") may be an obstacle to the adoption of this technology.
Examples of efforts to provide Grid programming models are ProActive, Ibis, and ICENI. ProActive [15] is a Java library for parallel, distributed and concurrent computing, also featuring mobility and security in a uniform framework. With a reduced set of simple primitives, ProActive provides a comprehensive API masking the specific underlying tools and protocols used, and simplifies the programming of applications that are distributed on a LAN, on a cluster of PCs, or on Internet Grids. The library is based on an active object pattern, on top of which a component-oriented view is provided.
The Ibis Grid programming environment [16] has been developed to provide parallel applications with highly efficient communication APIs. Ibis is based on the Java programming language and environment, using the "write once, run anywhere" property of Java to achieve portability across a wide range of Grid platforms. Ibis aims at Grid-unaware applications. As such, it provides rather high-level communication APIs that hide Grid properties and fit into Java's object model.
ICENI [17] is a grid middleware framework that adds value to lower-level grid services. It is a system of structured information that allows applications to be matched with heterogeneous resources and services, in order to maximize utilization of the grid fabric. Applications are encapsulated in a component-based manner, which clearly separates the provided abstraction and its possibly multiple implementations. Implementations are selected at runtime, so as to take advantage of dynamic information, and are selected in the context of the application rather than of a single component. This yields an execution plan specifying the implementation selection and the resources upon which the implementations are to be deployed. Overall, the burden of code modification for specific grid services is shifted from the application designer to the middleware itself.
Tools such as the P-GRADE Portal or GRID superscalar aim to ease the utilization of the Grid, but cover different areas from an end-user's point of view. While the P-GRADE Portal is a graphical tool, GRID superscalar is based on imperative language programs. Although there is some overlap in functionality, both tools show a lot of complementarity, and it is very challenging to make them interoperable. The integration of these tools may be a step towards achieving the idea of the "invisible" Grid for the end-user.
This work has been developed in the context of the NoE CoreGRID, more specifically in the Virtual Institute "Systems, Tools and Environments" (WP7), and aims to contribute to Task 7.3 "Integrated Toolkit". The "Integrated Toolkit" will provide means to develop Grid-unaware applications that are executed in the Grid in a way that is transparent to the user while increasing the performance of the application.
In this paper the integration of the P-GRADE Portal and GRID superscalar is discussed. In Section 2 the P-GRADE Portal is presented, and Section 3 covers the description of the GRID superscalar framework. Then, in Section 4, a comparison between both tools is given. Following that, Section 5 discusses an integration solution, and at the end of this paper Section 6 presents some conclusions, related work and future work.
2.
P-GRADE Portal
The P-GRADE Portal [1] is a workflow-oriented grid portal with the main
goal to cover the whole lifecycle of workflow-oriented computational grid ap-
plications. It enables the graphical development of workflows consisting of
various types of executable components (sequential, MPI or PVM programs),
executing these workflows in Globus-based grids relying on user credentials,
and finally analyzing the correctness and performance of applications by the
built-in visualization facilities.
A P-GRADE Portal workflow is an acyclic dependency graph that connects sequential and parallel programs into an interoperating set of jobs. The nodes of such a graph are jobs, while the arcs define the execution order of the jobs and the data dependencies between them that must be resolved by the workflow manager during the execution. An ultra-short range weather forecast (so-called nowcasting) grid application [2] is shown in Fig. 1 as an example of a P-GRADE Portal workflow.
Nodes (labelled as delta, cummu, visit, satel, and ready in Fig. 1 'Workflow editor') represent jobs, while the rectangles (labelled by numbers) around the nodes are called ports and represent data files that the corresponding jobs expect or produce. Directed arcs interconnect pairs of input and output files if an output file of a job serves as an input file for another job. The semantics of the workflow execution are that a job (a node of the workflow) can be executed if and only if all of its input files are available, i.e. all the jobs that produce input files for the job have successfully terminated, and all the user-defined input files are available either on the portal server or at the pre-defined grid storage providers. Therefore, the workflow describes both the control flow and the data flow of the application. If all the necessary input files are available for a job, then DAGMan [3], the workflow manager used in the Portal, transfers these files (together with the binary executable) to the site where the job has been allocated by the developer for execution. Managing the transfer of files and
recognition of the availability of the necessary files is the task of the workflow manager subsystem.

Figure 1. Meteorological application in P-GRADE Portal; workflow manager and workflow description with status information, multi-level visualization of a successful execution
To achieve high portability among the different grids, the P-GRADE Portal has been built onto the GridSphere portal framework [14] and the Globus middleware, particularly those tools of the Globus Toolkit that are generally accepted and widely used in production grids today. GridFTP, GRAM, MDS and GSI [4] have been chosen as the basic underlying toolset for the Portal. GridFTP services are used by the workflow manager subsystem to transfer input, output and executable files among computational resources, between computational and storage resources, and between the portal server and the different grid sites. GRAM is applied by the workflow manager to start up jobs on computational resources. An optional element of the Portal, the information system portlet, queries MDS servers to help developers map workflow components (jobs) onto computational resources. GSI is the security architecture that guarantees authentication, authorization and message-level encryption facilities for GridFTP, GRAM and MDS sites.
The choice of this infrastructure has been justified by connecting the P-GRADE Portal to several grid systems like the GridLab test-bed, the UK National Grid Service, and two VOs of the LCG-2 Grid (the See-Grid and HunGrid VOs). Notice that most of these grid systems use some extended version of the GT-2 middleware. The point is that if the compulsory GRAM, GridFTP and GSI middleware set is available in a VO, then the P-GRADE Portal can be immediately connected to that particular system.
Currently, the main drawback of the P-GRADE portal is the usage of Condor DAGMan as the core of the workflow manager, which does not allow the user to create cyclic graphs.
3. GRID superscalar
The aim of GRID superscalar [5] is to reduce the development complexity of Grid applications to the minimum, in such a way that writing an application for a computational Grid may be as easy as writing a sequential application [6]. It is a new programming paradigm for Grid-enabling applications, composed of an interface, a run-time and a deployment center. With GRID superscalar, a sequential application composed of tasks of a certain granularity is automatically converted into a parallel application where the tasks are executed in different servers of a computational Grid.
Figure 2 outlines GRID superscalar behaviour: from a sequential application code, a task-dependence graph is automatically generated, and from this graph the runtime is able to detect the inherent parallelism and submit the tasks for execution to resources in a grid.
The interface is composed of calls offered by the run-time itself and of calls defined by the user. The main program that the user writes for a GRID superscalar application is basically identical to the one that would be written for a sequential version of the application. The differences are that at some points of the code, some primitives of the GRID superscalar API are called. For instance, GS_On and GS_Off are called at the beginning and at the end of the application, respectively. Other changes are necessary for those parts of the program where files are read or written. Since the files are the objects that define the data dependences, the run-time needs to be aware of any operation performed on them. The current version offers four primitives for handling files: GS_Open, GS_Close, GS_FOpen and GS_FClose. These primitives implement the same behaviour as the standard open, close, fopen and fclose functions. In addition, the GS_Barrier function has been defined to allow the programmer to explicitly control the tasks' flow. This function waits until all Grid tasks have finished. Also, the GS_Speculative_End function offers an easy way to implement parameter studies by dealing with notifications from the workers in order to stop the computation when an objective has been reached. It is important to point out that several languages can be used when programming with GRID superscalar (currently C/C++, Perl, Java and Shell script are supported). Besides these changes in the main program, the rest of the code (including the user functions) does not require any further modification.
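As an illustration of the primitives described above, a master program could look like the following sketch. The task function filter(), the file names and the header name are invented for this example, and the exact primitive signatures are assumptions that may differ from the actual GRID superscalar distribution.

#include <stdio.h>
#include "GS_master.h"   /* GRID superscalar run-time interface (assumed header name) */

/* filter() is a user task declared in the IDL file; its body lives in the
   user functions file and is executed on a worker machine.                 */
void filter(char *input, char *output);

int main(void)
{
    GS_On();                           /* start the GRID superscalar run-time        */

    for (int i = 0; i < 4; i++) {
        char in[32], out[32];
        snprintf(in,  sizeof in,  "block%d.in",  i);
        snprintf(out, sizeof out, "block%d.out", i);
        filter(in, out);               /* intercepted: adds a node to the task graph */
    }

    GS_Barrier();                      /* wait until all Grid tasks have finished    */

    /* Any direct file access in the master would have to go through
       GS_Open/GS_Close or GS_FOpen/GS_FClose so that the run-time is
       aware of every operation performed on the files.                              */

    GS_Off(0);                         /* shut down the run-time                     */
    return 0;
}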
The interface defined by the user is described with an IDL file, where the functions that should be executed in the Grid are included. For each of these functions, the type and direction of the parameters must be specified (where direction means whether it is an input, output or input/output parameter). Parameters can be files or scalars, but in the current version data dependencies are only considered in the case of files.
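As a purely hypothetical illustration (the interface name, function name and parameters below are invented for this example), such an IDL file could look as follows:

interface FILTERAPP {
    /* direction and type are given for every parameter; data dependencies
       are only tracked for the File parameters */
    void filter(in File input, out File output, in int iterations);
};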
The basic set of files that a programmer provides for a GRID superscalar application is a file with the main program, a file with the user functions code and the IDL file. From the IDL file another set of files is automatically generated by the code generation tool gsstubgen. This second set of files consists of stubs and skeletons that convert the original application into a grid application that calls the run-time instead of calling the original functions. Finally, binaries for the master and the workers are generated, and the best way to do this is by using the GS deployment center.
The GS deployment center is a Java-based graphical user interface. It is able to check the grid configuration, and it also performs an automatic compilation of the main program on the localhost and of the worker programs on the server hosts.
Figure 2. GRID superscalar behaviour

GRID superscalar provides an underlying run-time that is able to detect the inherent parallelism of the sequential application and performs concurrent task submission. The components of the application that are the object of this concurrency exploitation are the functions listed in the IDL file. Each time one of these functions is called, the runtime system is invoked instead of the original function. A node is added to a data-dependence graph, and file dependencies between this function and previously called functions are detected. From this data-dependence graph, the runtime can submit for concurrent execution those functions that do not have any dependence between them. In addition to a data-dependence analysis based on those input/output task parameters which are files, techniques such as file renaming, file locality, disk sharing, checkpointing or constraint specification with ClassAds [7] are applied to increase the application performance, save computation time or select resources in the Grid. The run-time has been ported to different grid middlewares and the versions currently offered are: GT 2.4 [8], GT 4 [8], ssh/scp and Ninf-G2 [9].
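Purely as an illustration of this mechanism (the data structures of the real run-time are not described in this paper, so everything below is an assumption), a task node and a file-based dependence test could be sketched as follows:

#include <string.h>

#define MAX_FILES 16

/* Hypothetical node of the data-dependence graph. */
typedef struct task_node {
    const char *name;                     /* IDL function name              */
    const char *in_files[MAX_FILES];      /* input file parameters          */
    const char *out_files[MAX_FILES];     /* output file parameters         */
    int n_in, n_out;
    struct task_node *next;               /* previously submitted tasks     */
} task_node;

/* A new task depends on an earlier one if it reads a file that the earlier
   task writes (a true dependence); file renaming removes false dependences. */
static int depends_on(const task_node *later, const task_node *earlier)
{
    for (int i = 0; i < later->n_in; i++)
        for (int j = 0; j < earlier->n_out; j++)
            if (strcmp(later->in_files[i], earlier->out_files[j]) == 0)
                return 1;
    return 0;
}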
One possible limitation of the current version of GRID superscalar is that it only supports a single certificate per user at execution time. Regarding resource brokering, the selection is performed inside the run-time, but resource discovery is not supported, and machines are specified statically by the user using the GS deployment center. During the execution of the application the user can change the machines' information (add, remove or modify host parameters). Performance analysis of the application and the run-time has been done using Paraver [11], but it is not currently integrated in the runtime in such a way that end-users can benefit from it.
3.1 GRID superscalar monitor
The GRID superscalar monitor (GSM) visualizes the task-dependence graph at runtime, so the user can study the structure of his parallel application and track the progress of execution by knowing on which machine each task is executing, and its status. The GSM is implemented using UDrawGraph (UDG) [10], an interactive graph visualization package from the University of Bremen. Just as GRID superscalar, the GSM assumes that the Grid consists of a master machine and worker machines. Additionally, for monitoring purposes, we identify another machine, which does not belong to either of the aforementioned groups: the monitoring machine. By design, the GSM should be run on the monitoring machine, so as not to disturb or influence the Grid computation. Although this is not mandatory, the GSM can also be located on the master or on one of the worker machines, if desired.
Figure 3 shows an example of a GSM window, in which the user can work manually with the graph, resizing it or even changing the order of the nodes for a better understanding of the dependencies between tasks. This is an easy way to find out the degree of parallelism of the algorithm previously programmed with GRID superscalar. As the graph can grow easily for more complex programs, options for stopping the graph generation or for automatically scaling and reordering the graph are provided. The rest of the functionalities offered include saving the graph for later reference, printing the graph, or exporting it to a graphical file format (more precisely GIF, TIFF, JPEG or PNG).

Figure 3. GRID superscalar monitor window
The nodes representing the tasks in the GSM graph area are coloured in order to visually show the current state of each task, as well as the machine that computes it. With this colour configuration, a user can easily see which tasks have dependencies, which ones have their dependencies resolved, the tasks currently running, and the tasks that have already finished their computation. This is very important not only to monitor the execution of the GRID superscalar application, but also to allow the user to understand why his application cannot be executed with more parallelism.
4. Comparison of P-GRADE Portal and GRID superscalar
Table 1. Comparison between GRID superscalar and P-GRADE.

Support for data parallelism (graph generation):
  GRID superscalar - Advanced: automatic detection of data parallelism.
  P-GRADE - Manual: the user has to express it explicitly.
Support for acyclic/conditional dataflows:
  GRID superscalar - YES: using C or Perl.
  P-GRADE - NO: DAGMan/Condor based.
Compilation & staging of executables:
  GRID superscalar - YES: Deployment Center.
  P-GRADE - Limited: only run-time staging is supported.
Thin client concept:
  GRID superscalar - NO: a Globus client and a full GS installation are needed.
  P-GRADE - YES: only a Java-enabled browser is required.
Monitoring & performance visualization:
  GRID superscalar - Limited: monitoring only.
  P-GRADE - YES: multi-level visualization (workflow/job/processes).
Multi-Grid support:
  GRID superscalar - NO: only one certificate.
  P-GRADE - YES: several certificates are handled at the same time using a MyProxy server.
Support for legacy code:
  GRID superscalar - Limited: by using "wrapper" technology.
  P-GRADE - YES: MPI/PVM jobs or GEMLCA services.
The aim of both the P-GRADE Portal and the GRID superscalar system is to ease the programming of grid systems by providing high-level environments on top of the Globus middleware. While the P-GRADE Portal is a graphical interface that integrates a workflow developer tool with the DAGMan workflow manager system, GRID superscalar is a programming API and a toolset that provide automatic code generation as well as configuration and deployment facilities. Table 1 outlines the differences between the two systems.
5.
Overview of the solution
The main purpose of the integration of the P-GRADE Portal - GRID superscalar system is to create a high-level, graphical grid programming, deployment and execution environment that combines the workflow-oriented thin client concept of the P-GRADE Portal with the automatic deployment and application parallelisation capabilities of GRID superscalar. This integration work can be realised in three different ways:
• Scenario 1: A new job type can be introduced in the P-GRADE workflow for a complete GRID superscalar application.
• Scenario 2: A sub-graph of a P-GRADE workflow can be interpreted as a GRID superscalar application.
• Scenario 3: A GRID superscalar application can be generated based on the entire P-GRADE workflow description.
In the case of the first two scenarios, the interoperability between existing P-GRADE workflow applications and GRID superscalar applications would be provided by the system. On the other hand, Scenarios 2 and 3 would enable the introduction of new language elements into the P-GRADE workflow description for steering the data/control flow in a more sophisticated way, e.g. using conditional or loop constructs similarly to UNICORE [13]. Scenario 3 was selected as the most promising one and is discussed in detail in this paper.
Before the design and implementation issues, it is important to distinguish the main roles of site administrators, developers, and end-users, which are often mixed and misunderstood in academic grid solutions. The new integrated system will support the following actors (see Fig. 4):
1. The site administrator, who is responsible for the installation and configuration of the system components such as the P-GRADE portal, GRID superscalar, and the other required grid-related software packages.
2. The Grid application developer and deployer, who develops the workflow application with the editor of the P-GRADE portal, configures the access to the Grid resources, deploys the jobs with the GS deployment center, and finally optimizes the performance of the application using the Mercury Grid monitor and the visualisation facilities of the P-GRADE portal.
3. The end-user, who runs and interprets the results of the executions with the P-GRADE portal and its application-specific portlets from any thin client machine.
Therefore, there are several benefits of the integrated solution from the end-users' point of view: they do not have to tackle grid-related issues.
Figure 4. The roles in the integrated P-GRADE Portal - GRID superscalar system
In order to achieve these goals, a new code generator, GRPW2GS, is integrated in the P-GRADE portal. It is responsible for the generation of a GRID superscalar-compliant application from a workflow description (GRPW): an IDL file, a main program file, and a functions file.
In the IDL file, each job of the actual workflow is listed as a function declaration within the interface declaration. The structure of a generated GRID superscalar IDL file is shown in the following lines:

interface workflowname {
    void jobname(dirtype File filename, ...);
}
where workflowname and jobname are unique identifiers inherited from the workflow description. The dirtype can be in or out, depending on the direction of the actual file. The actual value of filename depends on the dependencies of the file. If it is a file without dependencies (i.e. an input or output of the entire workflow application), the filename can be the original name. On
the other hand, if the file is an input of another job, a unique file identifier
is generated since in P-GRADE descriptions the filenames are not unique at
workflow level.
The following lines show the structure of a main program file generated from a workflow:

#include "GS_master.h"
void main(int argc, char **argv) {
    GS_On();
    jobname1("filename1", ...);
    jobname2("filename2", ...);
    GS_Off(0);
}
For the generation of the functions file, two options have been taken into consideration:
• using a simple wrapper technique for legacy code, or
• generating the entire application from source.
In the first case, the executable must be provided and uploaded to the portal server by the developer, similarly to the existing P-GRADE portal solution. The definitions of the function calls (corresponding to the jobs) in the functions file then contain only system calls to invoke these executables, which are staged by the P-GRADE portal to the appropriate site (selected by the resource broker).
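A sketch of such a generated wrapper is given below. The function name (taken from the delta job of Fig. 1), the executable name and the call convention are all hypothetical; the only point is that the function body merely invokes the staged legacy executable.

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper generated for the workflow job "delta": it only
   invokes the legacy executable that the P-GRADE portal staged to the
   selected site, passing the (possibly renamed) input and output files. */
void delta(char *infile, char *outfile)
{
    char cmd[512];
    snprintf(cmd, sizeof cmd, "./delta_executable %s %s", infile, outfile);
    if (system(cmd) != 0)
        fprintf(stderr, "delta: legacy executable failed\n");
}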
In the second case, the application developer uploads the corresponding C code as the 'body' of the function using the job properties dialogue window of the P-GRADE portal. In this case, the developer gets a more flexible and architecture-independent solution, since the GS deployment center can assist in creating the appropriate executables on Globus sites with various architectures.
After the automatic generation of the code, the application developer can deploy the code with the GS deployment center, and the performance analysis phase can be started. For this purpose, the execution manager of GRID superscalar has to generate a PROVE-compliant trace file to visualise the workflow-level execution. This means instrumenting with GRM those of its code fragments which deal with resource selection, job submission and file transfers. In order to get a more detailed view, the parallel MPI code can also be instrumented with a PROVE-compliant MPICH instrumentation library developed by SZTAKI.
Concerning the resource broker, the job requirements (defined in the job attributes dialog window for each job) can also be passed to the GS broker from the workflow editor in the case of GT-2 grids, or the LCG-2 based resource broker can be used in the P-GRADE portal.
