22
Granular Computing and Rough Sets - An Incremental Development

Tsau Young (’T. Y.’) Lin¹ and Churn-Jung Liau²

¹ Department of Computer Science, San Jose State University, San Jose, CA 95192
² Institute of Information Science, Academia Sinica, Taipei 115, Taiwan
Summary. This chapter gives an overview and refinement of recent works on binary granular computing. For comparison and contrast, granulation and partition are examined in parallel from the perspective of rough set theory (RST). The key strength of RST is its capability in representing and processing knowledge in table formats. Even though such capabilities are not available for general granulation, this chapter illustrates and refines some of them for binary granulation. In rough set theory, quotient sets, table representations, and concept hierarchy trees are all set-theoretical, while in binary granulation they are a special kind of pretopological space, which is equivalent to a binary relation. Here a pretopological space means a space that is equipped with a neighborhood system (NS). An NS is similar to the classical NS of a topological space, but without any axioms attached to it.³
Key words: Granular computing, rough set, binary relation, equivalence relation
22.1 Introduction
Though the label "granular computing" is relatively recent, the notion of granulation has in fact appeared, under different names, in many related fields, such as programming, divide and conquer, fuzzy and rough set theories, pretopological spaces, interval computing, quantization, data compression, chunking, cluster analysis, belief functions, machine learning, databases, and many others. In the past few years, we have seen a renewed and fast-growing interest in Granular Computing (GrC). Many applications of granular computing have appeared in fields such as medicine, economics, finance, business, environment, electrical and computer engineering, a number of sciences, software engineering, and information science.
³ This is an expansion of the article (Lin, 2005) in IEEE Connections, the newsletter of the IEEE Computational Intelligence Society.
Granulation seems to be a natural problem-solving methodology deeply rooted in human thinking. Many daily "things" have been routinely granulated into sub-"things": the human body has been granulated into head, neck, and so forth; geographic features into mountains, plains, and others. The notion is intrinsically fuzzy, vague and imprecise. Mathematicians idealized it into the notion of partitions, and developed it into a fundamental problem-solving methodology; it has played major roles throughout the entire history of mathematics.
Nevertheless, the notion of partitions, which absolutely does not permit any overlapping among its granules, seems too restrictive for real-world problems. Even in natural science, classification does permit a small degree of overlapping; there are beings that are appropriate subjects of both zoology and botany. A more general theory is needed.
Based on Zadeh's grand project on granular mathematics, during his sabbatical leave (1996/1997) at Berkeley, Lin focused on a subset of granular mathematics, which he called granular computing (Zadeh, 1998). To stimulate research on granular computing, a special interest group, with T. Y. Lin as its Chair, was formed within BISC (Berkeley Initiative in Soft Computing). Since then, granular computing has evolved into an active research area, generating many articles, books and presentations at conferences, workshops and special sessions. This chapter is devoted to presenting some of these developments over the past few years.
There are two possible approaches: (1) starting from the fuzzy side and moving down, or (2) starting from the extreme crisp side and moving up. In this chapter, we take the second approach incrementally. Recall that, algebraically, a partition is an equivalence relation, so a natural next step is the binary granulation defined by a binary relation. For contrast, we may call a partition an A-granulation and the more general granulation a B-granulation.
22.2 Naive Model for Problem Solving
An obvious approach to a large-scale computing problem is: (1) to divide the problem into subtasks, possibly point by point and level by level; (2) to elevate or abstract the problem into concept/knowledge spaces, possibly in multiple levels; (3) to integrate the solutions of subtasks and quotient tasks (knowledge spaces) of several levels.
22.2.1 Information Granulations/Partitions
In the first step, we select an appropriate system of granulation/partition so that only the summaries of granules/equivalence classes may enter into the higher-level computing. The information in the data space is transformed to a concept space, possibly in levels, which may be local at each point or global over the whole universe (Lin, 2003b). Classically, we granulate by partitioning (no overlapping of granules). Such examples are plentiful: in mathematics (quotient groups, quotient rings, etc. (Birkhoff and MacLane, 1977)), in theoretical computer science (divide-and-conquer (Aho et al., 1974)), in software engineering (structural, object-oriented, and component-based design and programming (Szyperski, 2002)), in artificial intelligence (Hobbs, 1985, Zhang and Zhang, 1992), and in rough set theory (Pawlak, 1991), among others. However, these are all partition based, where no overlapping of granules is permitted. As we have observed, even in biology, classification does allow some overlapping. The focus of this presentation will be on non-partition theory, but only an epsilon step away from the partition method.
22.2.2 Knowledge Level Processing and Computing with Words
The information in each granule is summarized and the original problem is re-expressed in terms of symbols, words, predicates or linguistic variables. Such re-expression is often referred to as knowledge representation. Its processing has been termed computing with symbols (table processing), computing with words, knowledge-level processing, or even precisiated natural language, depending on the complexity of the representations.

In this chapter, we are computing on the space of granules, or "quotient space," in which each granule is represented by a word that carries different degrees of semantics. For partition theory, the knowledge representation is in table format (Pawlak, 1991) and its computation is syntactic in nature. Binary granulation, on which we focus here, is semantics oriented. We expand and streamline the previous works (Lin, 1998a, Lin, 1998b, Lin, 2000); the main idea is to transfer the computing with words into computing with symbols.
Loosely speaking, computing with symbols, or symbolic computing, is an "axiomatic" computing: all rules for computing with symbols are determined by the axioms. The computation follows the formal specifications. Such computing occurs only in an ideal situation. In many real-world applications, unfortunately, such as non-linear computing, the formal specifications are often unavailable. So computing with words is needed; it can be processed informally. The semantics of words often cannot be completely or precisely formalized. Their semantic computing is often carried out in systems with human help (the semantics of symbols are not implemented). Human-enforced semantic computing is common in data processing environments.
22.2.3 Information Integration and Approximation Theory
Most applications require the solutions to be presented at the same level as the input data. So the solutions often need to be integrated from subtasks (solutions in granules) and quotient tasks (solutions in the spaces of granules). Some applications, such as data mining and some rough set theory, are aimed at high-level information; in such cases this step can be skipped. In general, the integration is not easy. In the partition world, many theories have been developed in mathematics, e.g., extension functors. The approximation theory of pretopological spaces and rough set theory can be regarded as belonging to this step.
22.3 A Geometric Model of Information Granulation
To convey the general idea, we recall and refine in this section a previous formalization from (Lin, 1998a). The goal is to formalize Zadeh's informal notion of granulation mathematically. As the original thesis is informal, the best we can do is to present, hopefully, convincing arguments. We believe our formal theory is very close to the informal one.
According to Zadeh (1996):
Information granulation involves partitioning a class of objects (points) into granules, with a granule being a clump of objects (points) which are drawn together by indistinguishability, similarity or functionality.
We will literally take Zadeh's informal words as a formal definition of granulation. We observe that:

1. A granule is a group of objects that are drawn together (by indistinguishability, similarity or functionality).
The phrase "drawn together" implicitly implies a certain level of symmetry among the objects in a granule. Namely, if p is drawn towards q, then q is also drawn towards p.
Such symmetry, we believe, is imposed by the imprecision of natural language. To avoid such an implication, we will rephrase it as "drawn towards an object p," so that it is clear the reverse may or may not be true. So we have the first revision:
2. A granule is a group B(p) of objects that are drawn toward an object p. Here p varies through every object in the universe.
3. Such an association between an object p and a granule B(p) induces a map from the object space to the power set of the object space. This map has been called a binary granulation (BG).
4. Geometric View:
We may use geometric terminology and refer to the granule as a neighborhood of p, and the collection {B(p)} as a binary neighborhood system (BNS). It is possible that B(p) is an empty set; in this case we will simply say p has no neighborhood (an abuse of language; to be precise, we should say p has an empty neighborhood). Also, different points may have the same neighborhood (granule), B(p) = B(q). The set of all q such that B(q) is equal to B(p) is called the center C(p) of B(p).
5. Algebraic View:
Consider the set R = {(p, u) : u ∈ B(p), p ∈ U}. It is clear that R is a subset of U × U, hence defines a binary relation (BR), and vice versa.
Proposition 1 A binary neighborhood system (BNS), a binary granulation (BG), and a binary relation (BR) are equivalent.
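To make Proposition 1 concrete, here is a minimal Python sketch; the function and variable names (relation_to_granulation, center, the toy universe U and relation R) are our own illustrations, not notation from the chapter. It converts a binary relation to the binary granulation it induces and back, and computes the center C(p) of item 4.

```python
# A minimal sketch of Proposition 1: a binary relation (BR), the binary
# granulation (BG) it induces, and the binary neighborhood system (BNS)
# carry the same information.

def relation_to_granulation(U, R):
    """BR -> BG: the granule B(p) = {u : (p, u) in R} drawn toward p."""
    return {p: {u for (q, u) in R if q == p} for p in U}

def granulation_to_relation(B):
    """BG -> BR: R = {(p, u) : u in B(p)}."""
    return {(p, u) for p, granule in B.items() for u in granule}

def center(B, p):
    """C(p): all points whose granule equals B(p) (item 4 above)."""
    return {q for q in B if B[q] == B[p]}

U = {1, 2, 3}
R = {(1, 1), (1, 2), (2, 3)}            # not symmetric, so not a partition
B = relation_to_granulation(U, R)        # {1: {1, 2}, 2: {3}, 3: set()}
assert granulation_to_relation(B) == R   # the round trip loses nothing
print(B[3], center(B, 1))                # point 3 has an empty neighborhood
```

Note that the toy relation is neither symmetric nor transitive, yet the round trip is exact: the equivalence of the three views needs no axioms on the relation.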
From the analysis given above, we propose the following mathematical model
for information granulation.
Definition 1 By a (single level) information granulation defined on a set U we mean a binary granulation (binary neighborhood system, binary relation) defined on U.
Let us go a little further. Note that the binary relation is a mathematical expression of Zadeh's "indistinguishability, similarity or functionality." We abstract the three properties into a list of abstract binary relations {B_j | j runs through some index set}, where each B_j is a binary relation.

Note that at each point p, each B_j induces a neighborhood B_j(p). Some may be empty, or identical. By removing empty sets and duplicates, the family can be re-indexed as N_i(p). As in the single-level case, we define the granulation directly:

N : U → 2^(2^U); p → {N_i(p) | i runs through some index set}.

The collection {N_i(p)} is called a neighborhood system (NS), or LNS; the latter name is used to distinguish it from the neighborhood system (TNS) of a topological space (Lin, 1989a, Lin, 1992).

Definition 2 By a local multi-level information granulation defined on U, we mean a neighborhood system (NS) defined on U. By a global multi-level information granulation defined on U, we mean a set of BGs defined on U.
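Definition 2 is easy to make concrete. Below is a small sketch under our own naming (neighborhood_system, B1, B2 are illustrative, not from the chapter) that builds the map p → {N_i(p)}: each point receives the granules induced by several binary relations, with empty and duplicate neighborhoods removed as described above.

```python
# A sketch of a (local) neighborhood system: several binary relations,
# each given as a set of pairs, jointly assign to every point a *set of*
# neighborhoods.

def neighborhood_system(U, relations):
    """NS: p -> {N_i(p)}, with empty sets and duplicates removed."""
    ns = {}
    for p in U:
        granules = {frozenset(u for (q, u) in Bj if q == p)
                    for Bj in relations}          # one granule per relation
        ns[p] = granules - {frozenset()}          # discard empty neighborhoods
    return ns

U = {1, 2}
B1 = {(1, 1), (1, 2)}   # a "similarity" relation (illustrative)
B2 = {(1, 1), (2, 2)}   # an "indistinguishability" relation (illustrative)
print(neighborhood_system(U, [B1, B2]))
# {1: {frozenset({1, 2}), frozenset({1})}, 2: {frozenset({2})}}
```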
All notions can be fuzzified. The right way to read this section is to assume implicitly that there is a modifier "crisp/fuzzy" attached to all notions presented above.
22.4 Information Granulations/Partitions
Technically, granular computing is actually computing with constraints. Especially in the "infinite world," granulation is often given in terms of constraints. In this chapter, we are concerned primarily with constraints that are mathematically represented as binary relations.
22.4.1 Equivalence Relations (Partitions)
A partition is a decomposition of the universe into a family of disjoint subsets. These subsets are called equivalence classes, because a partition induces an equivalence relation and vice versa. In this chapter, we will view the equivalence classes in a special way. Let A ⊆ U × U be an equivalence relation (a reflexive, symmetric and transitive binary relation). For each p, let

A_p = {v ∈ U : pAv}    (22.1)

A_p is the equivalence class containing p, and will be called an A-granule for the purpose of contrasting with the general cases. Elements of A_p are equivalent to each other. Let us summarize the discussion in:

A : U → 2^U : p → A_p    (22.2)
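As a small worked illustration of Eqs. (22.1) and (22.2), the following sketch, with hypothetical names of our own (a_granule, the toy universe U), computes A-granules from an equivalence relation given as a set of pairs; the resulting granules form the expected partition.

```python
# Computing A_p = {v in U : p A v} and the map p -> A_p of Eq. (22.2).

def a_granule(U, A, p):
    """Return A_p, the equivalence class of p under the relation A."""
    return frozenset(v for v in U if (p, v) in A)

U = {"a", "b", "c", "d"}
# An equivalence relation on U whose classes are {a, b} and {c, d}:
A = {(x, y) for x in U for y in U
     if {x, y} <= {"a", "b"} or {x, y} <= {"c", "d"}}

granulation = {p: a_granule(U, A, p) for p in U}   # the map of Eq. (22.2)
assert granulation["a"] == granulation["b"] == frozenset({"a", "b"})
print(set(granulation.values()))  # two disjoint A-granules: the partition
```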