(IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, March 2010 Probabilistic Semantic Web Mining Using Artificial Neural Analysis 1.1 , Information, and Knowledge

1. T. KRISHNA KISHORE, 2. T.SASI VARDHAN, AND 3. N. LAKSHMI NARAYANA Abstract Most of the web user’s requirements are 1.1.1 Data search or navigation time and getting correctly

matched result. These constrains can be satisfied Data are any facts, numbers, or text that with some additional modules attached to the can be processed by a computer. Today, existing search engines and web servers. This paper organizations are accumulating vast and growing proposes that powerful architecture for search amounts of data in different formats and different engines with the title of Probabilistic Semantic databases. This includes: Web Mining named from the methods used.  Operational or transactional data such as, With the increase of larger and larger sales, cost, inventory, payroll, and accounting collection of various data resources on the World  No operational data, such as industry Wide Web (WWW), Web Mining has become one sales, forecast data, and macro economic data of the most important requirements for the web  Meta data - data about the data itself, such users. Web servers will store various formats of as logical database design or data dictionary data including text, image, audio, video etc., but definitions servers can not identify the contents of the data. 1.1.2 Information These search techniques can be improved by The patterns, associations, or relationships adding some special techniques including semantic among all this data can provide information. For web mining and probabilistic analysis to get more example, analysis of retail point of sale transaction accurate results. data can yield information on which products are Semantic web mining technique can provide selling and when. meaningful search of data resources by eliminating 1.1.3 Knowledge useless information with mining process. In this technique web servers will maintain Meta Information can be converted into information of each and every data resources knowledge about historical patterns and future available in that particular . This will trends. For example, summary information on retail help the search engine to retrieve information that supermarket sales can be analyzed in light of is relevant to user given input string. promotional efforts to provide knowledge of This paper proposing the idea of combing consumer buying behavior. Thus, a manufacturer these two techniques Semantic web mining and or retailer could determine which items are most Probabilistic analysis for efficient and accurate susceptible to promotional efforts. search results of web mining. SPF can be 1.1.4 Data Warehouses calculated by considering both semantic accuracy Dramatic advances in data capture, and syntactic accuracy of data with the input string. processing power, data transmission, and storage This will be the deciding factor for producing capabilities are enabling organizations to integrate results. their various databases into data warehouses. Data 1. Concept of warehousing is defined as a process of centralized

294 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, March 2010

and retrieval. Data warehousing, analytical software are available: statistical, like data mining, is a relatively new term although , and neural networks. Generally, the concept itself has been around for years. Data any of four types of relationships are sought: warehousing represents an ideal vision of  Classes: Stored data is used to locate data in maintaining a central repository of all predetermined groups. For example, a restaurant organizational data. chain could mine customer purchase data to 1.2 What can data mining do? determine when customers visit and what they Data mining is primarily used today by typically order. This information could be used companies with a strong consumer focus - retail, to increase traffic by having daily specials. financial, communication, and marketing  Clusters: Data items are grouped according to organizations. It enables these companies to logical relationships or consumer preferences. determine relationships among "internal" factors For example, data can be mined to identify such as price, product positioning, or staff skills, market segments or consumer affinities. and "external" factors such as economic indicators,  Associations: Data can be mined to identify competition, and customer demographics. And, it associations. The beer-diaper example is an enables them to determine the impact on sales, example of associative mining. customer satisfaction, and corporate profits.  Sequential patterns: Data is mined to anticipate Finally, it enables them to "drill down" into behavior patterns and trends. For example, an summary information to view detail transactional outdoor equipment retailer could predict the data. likelihood of a backpack being purchased based With data mining, a retailer could use on a consumer's purchase of sleeping bags and point-of-sale records of customer purchases to send hiking shoes. targeted promotions based on an individual's Data mining process includes various internal purchase history. By mining demographic data operations as shown in Fig 1 from comment or warranty cards, the retailer could develop products and promotions to appeal to specific customer segments. For example, Blockbuster Entertainment mines its video rental history database to recommend rentals to individual Fig 1: Data Mining Process customers. American Express can suggest products 1.3.1 Data mining consists of five major to its cardholders based on analysis of their monthly expenditures. elements: 1.3 How does data mining work?  Extract, transform, and load transaction While large-scale information technology data onto the system. has been evolving separate transaction and  Store and manage the data in a analytical systems, data mining provides the link multidimensional database system. between the two. Data mining software analyzes  Provide data access to business analysts relationships and patterns in stored transaction data and information technology professionals. based on open-ended user queries. Several types of  Analyze the data by application software.

295 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, March 2010

 Present the data in a useful format, such as  Data/information extraction: Our focus a graph or table. will be on extraction of structured data from Web 1.3.2 Different levels of analysis are pages, such as products and search results. available: Extracting such data allows one to provide services. Two main types of techniques, machine  Artificial neural networks: Non-linear learning and automatic extraction are covered. predictive models that learn through training and  Web information integration and resemble biological neural networks in structure. schema matching: Although the Web contains a  Genetic algorithms: Optimization techniques huge amount of data, each web site (or even page) that use process such as genetic combination, represents similar information differently. How to mutation, and natural selection in a design based identify or match semantically similar data is a on the concepts of natural evolution. very important problem with many practical  Decision trees: Tree-shaped structures that applications. Some existing techniques and represent sets of decisions. These decisions problems are examined. generate rules for the classification of a dataset.  Opinion extraction from online sources: Specific decision tree methods include There are many online opinion sources, e.g., Classification and Regression Trees (CART) and customer reviews of products, forums, blogs and Chi Square Automatic Interaction Detection chat rooms. Mining opinions (especially consumer (CHAID). CART and CHAID are decision tree opinions) is of great importance for marketing techniques used for classification of a dataset. intelligence and product benchmarking. We will They provide a set of rules that you can apply to introduce a few tasks and techniques to mine such a new (unclassified) dataset to predict which sources. records will have a given outcome. CART  Knowledge synthesis: Concept segments a dataset by creating 2-way splits while hierarchies or ontology are useful in many CHAID segments using chi square tests to create applications. However, generating them manually multi-way splits. CART typically requires less is very time consuming. A few existing methods data preparation than CHAID. that explores the information redundancy of the  Nearest neighbor method: A technique that Web will be presented. The main application is to classifies each record in a dataset based on a synthesize and organize the pieces of information combination of the classes of the k record(s) on the Web to give the user a coherent picture of most similar to it in a historical dataset (where k the topic domain.. 1). Sometimes called the k-nearest neighbor  Segmenting Web pages and detecting technique. noise: In many Web applications, one only wants  Rule induction: The extraction of useful if-then the main content of the Web page without rules from data based on statistical significance. advertisements, navigation links, copyright notices.  Data visualization: The visual interpretation of Automatically segmenting Web page to extract the complex relationships in multidimensional data. main content of the pages is interesting problem. A Graphics tools are used to illustrate data number of interesting techniques have been relationships proposed in the past few years. 1.3.3 Operations of Web Mining:

296 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, March 2010

2. Types of Web Mining richness or perhaps the variety of topics covered in Web Mining is the extraction of the document. This can be compared to interesting and potentially useful patterns and bibliographical citations. When a paper is cited implicit information from artifacts or activity often, it ought to be important. The Page Rank and related to the . There are roughly CLEVER methods take advantage of this three knowledge discovery domains that pertain to information conveyed by the links to find pertinent web mining: Web Content Mining, Web Structure web pages. 2.3 Web Usage Mining Mining, and Web Usage Mining. Web content Web servers record and accumulate data mining is the process of extracting knowledge from about user interactions whenever requests for the content of documents or their descriptions. Web resources are received. Analyzing the web access document , resource discovery based on logs of deferent web sites can help understand the concepts indexing or agent based technology may user behavior and the web structure, thereby also fall in this category. Web structure mining is improving the design of this colossal collection of the process of inferring knowledge from the World resources. There are two main tendencies in Web Wide Web organization and links between Usage Mining driven by the applications of the references and referents in the Web. Finally, web discoveries: General Access Pattern Tracking and usage mining, also known as Web Log Mining, is Customized Usage Tracking. The general access the process of extracting interesting patterns in web pattern tracking analyzes the web logs to access logs.2.1 Web Content Mining understand access patterns and trends. These Web content mining is an automatic analyses can shed light on better structure and process that goes beyond keyword extraction. Since grouping of resource providers. Many web analysis the content of a text document presents no tools existed but they are limited and usually machine-readable semantic, some approaches have unsatisfactory. We have designed a web log data suggested to restructure the document content in a mining tool, Weblog Miner, and proposed representation that could be exploited by machines. techniques for using data mining and Online The usual approach to exploit known structure in Analytical Processing (OLAP) on treated and documents is to use wrappers to map documents to transformed web access files. Some scripts custom- some data model. Techniques using lexicons for tailored for some sites may store additional content interpretation are yet to come. There are information. However, for an effective web usage two groups of web content mining strategies: Those mining, an important that directly mine the content of documents and those that improve on the content search of other tools like search engines. 2.2 Web Structure Mining World Wide Web can reveal more information than just the information contained in g and data documents. For example, links pointing to a Fig 2 : Types of Web Mining document indicate the popularity of the document, 3. What is Semantic Web Mining? while links coming out of a document indicate the

297 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, March 2010

The effort behind the Semantic Web is to 3.2 Mapping and Merging Ontologies add semantic annotation to Web documents in With the growing usage of ontologies, the order to access knowledge instead of unstructured problem of overlapping knowledge in a common material, allowing knowledge to be managed in an domain occurs more often and becomes critical. automatic way. Web Mining can help to learn Domain-specific ontologies are modeled by definitions of structures for knowledge multiple authors in multiple settings. These organization (e. g., ontologies) and to provide the ontologies lay the foundation for building new population of such knowledge structures. domain-specific ontologies in similar domains by All approaches discussed here are semi- assembling and extending multiple ontologies from automatic. They assist the knowledge engineer in repositories. The process of ontology merging takes extracting the , but cannot completely as input two (or more) source ontologies and replace her. In order to obtain high-quality results, returns a merged ontology based on the given one cannot replace the human in the loop, as there source ontologies. Another method is FCA-Merge is always a lot of tacit knowledge involved in the which merges ontologies following a bottom-up modeling process. A computer will never be able to approach, offering a global structural description of fully consider background knowledge, experience, the process. D or social conventions. 4. Architecture of Semantic Web Miner 3.1 Ontology Learning: In this semantic web miner architecture Extracting ontology from the Web is a challenging includes various components to perform syntactic task. One way is to engineer the ontology by hand, as well as semantic analysis on the web resources but this is quite an expensive way. The expression for the given input string. At the same time each Towards Semantic Web Mining Ontology Learning web server must contains one special module Meta was coined for the semi-automatic extraction of Information Analyzer. semantics from the Web in order to create ontology 5. Features of Semantic Web Miner the techniques produce intermediate results which Semantic web Miner Marjory considers must the concept of time complexity of searching. In general search engines will need to spend more time in searching the web results by simple matching of given input string with the contents of each and every web resource

Fig 3: Ontology Mapping and Merging finally be integrated in one machine- understandable format, ex: ontology. F Fig 4: Semantic Web Miner

298 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, March 2010

6. What is Neural Network? performance. However, some network An Artificial Neural Network (ANN) is an capabilities may be retained even with major information processing paradigm that is inspired by network damage. the way biological nervous systems, such as the 6.2 How the Human Brain Learns? brain, process information. The key element of this In the human brain, a typical neuron paradigm is the novel structure of the information collects signals from others through a host of fine processing system. It is composed of a large structures called dendrites. The neuron sends out number of highly interconnected processing spikes of electrical activity through a long, thin elements (neurons) working in unison to solve stand known as an axon, which splits into specific problems. ANNs, like people, learn by thousands of branches. At the end of each branch, a example. structure called a synapse converts the activity 6.1 Why use neural networks? from the axon into electrical effects that inhibit or Neural networks, with their remarkable ability to excite activity from the axon into electrical effects derive meaning from complicated or imprecise that inhibit or excite activity in the connected data, can be used to extract patterns and detect neurons. When a neuron receives excitatory input trends that are too complex to be noticed by either that is sufficiently large compared with its humans or other computer techniques. inhibitory input, it sends a spike of electrical activity down its axon.

Fig 5: Components of a neuron Other advantages include: 1. Adaptive learning: An ability to learn how to Fig 6: Synapse do tasks based on the data given for training or Learning occurs by changing the effectiveness of initial experience. the synapses so that the influence of one neuron on 2. Self-Organization: An ANN can create its own another changes. organization or representation of the 6.3 From Human Neurons to Artificial information it receives during learning time. Real Time Operation: ANN computations may be Neurons carried out in parallel, and special hardware devices are being designed and 3. manufactured which take advantage of this capability. 4. Fault Tolerance via Redundant Information Coding: Partial destruction of a network leads Fig 8: Neuron Structure to the corresponding degradation of

299 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, March 2010

We conduct these neural networks by first The addition of input weights and of the trying to deduce the essential features of neurons threshold makes this neuron a very flexible and and their interconnections. powerful one. The MCP neuron has the ability to 7. Components of Artificial Neural adapt to a particular situation by changing its Analyzer weights and/or threshold. Various algorithms exist that cause the neuron to 'adapt'; the most used ones 7.1 A simple neuron are the Delta rule and the back error propagation. The former is used in feed-forward networks and the latter in feedback networks. 7.3 Feed-forward networks Feed-forward ANNs allow signals to travel one way only; from input to output. There is no feedback (loops) i.e. the output of any layer Fig 8: Simple Neuron does not affect that same layer. Feed-forward An artificial neuron is a device with many inputs ANNs tend to be straight forward networks that and one output. The neuron has two modes of associate inputs with outputs. They are extensively operation; the training mode and the using mode. In used in . This type of the training mode, the neuron can be trained to fire organization is also referred to as bottom-up or top- (or not), for particular input patterns. down. 7.2 A more complicated neuron The previous neuron doesn't do anything that conventional computers don't do already. A more sophisticated neuron is the McCulloch and Pitts model (MCP). The difference from the previous model is that the inputs are ‘weighted’; the effect that each input has at decision making is

Fig 10: General Neural Network 7.3.1 Network layers The commonest type of artificial neural network consists of three groups, or layers, of units: a layer of "input" units is connected to a Fig 9: Complicated Neuron layer of "hidden" units, which is connected to a dependent on the weight of the particular input. layer of "output" units. The weight of an input is a number which when  The activity of the input units represents the multiplied with the input gives the weighted input. raw information that is fed into the network. In any other case the neuron does not fire. In  The activity of each hidden unit is determined mathematical terms, the neuron fires if and only if; by the activities of the input units and the X1W1 + X2W2 + X3W3 + ... > T

300 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, March 2010

weights on the connections between the input excitatory connection whereas a negative weight and the hidden units. represents an inhibitory connection  The behavior of the output units depends on the activity of the hidden units and the weights between the hidden and output units. This simple type of network is interesting because the hidden units are free to construct their own representations of the input. The weights between the input and hidden units determine when Fig 11: Back Propagation Neural Network each hidden unit is active, and so by modifying A unit in the output layer determines its activity by these weights, a hidden unit can choose what it following a two step procedure. represents.  First, it computes the total weighted input xj, We also distinguish single-layer and using the formula: multi-layer architectures. The single-layer organization, in which all units are connected to one another, constitutes the most general case and Where yi is the activity level of the jth unit in the is of more potential computational power than previous layer and Wij is the weight of the hierarchically structured multi-layer organizations. connection between the ith and the jth unit. 7.4 Feedback networks  Next, the unit calculates the activity yj using Feedback networks can have signals some function of the total weighted input. traveling in both directions by introducing loops in Typically we use the sigmoid function: the network. Feedback networks are very powerful and can get extremely complicated. Feedback networks are dynamic; their 'state' is changing  Once the activities of all output units have been continuously until they reach an equilibrium point. determined, the network computes the error E, 7.5 Back-propagation Algorithm - a which is defined by the expression: mathematical approach Units are connected to one another.

Connections correspond to the edges of the Where yj is the activity level of the jth unit in the underlying directed graph. There is a real number top layer and dj is the desired output of the jth unit. associated with each connection, which is called 7.5.1 The back-propagation algorithm the weight of the connection. We denote by Wij the consists of four steps: weight of the connection from unit ui to unit uj. It 1. Compute how fast the error changes as the is then convenient to represent the pattern of activity of an output unit is changed. This error connectivity in the network by a weight matrix W derivative (EA) is the difference between the actual whose elements are the weights Wij. Two types of and the desired activity. connection are usually distinguished: excitatory and inhibitory. A positive weight represents an

301 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, March 2010

2. Compute how fast the error changes as the total  There is no need to assume an underlying data input received by an output unit is changed. This distribution such as usually is done in statistical quantity (EI) is the answer from step 1 multiplied modeling. by the rate at which the output of a unit changes as  Neural networks are applicable to multivariate its total input is changed. non-linear problems.  The transformations of the variables are automated in the computational process

3. Compute how fast the error changes as a weight 9. Components of Probabilistic on the connection into an output unit is changed. Semantic Web Mining This quantity (EW) is the answer from step 2 This paper proposes the idea of combining multiplied by the activity level of the unit from the efficient methods those can make efficient for which the connection emanates. the web mining. For obtaining this effect, search engine should mainly contain three components:  Semantic Web Miner,  Artificial Neural Analyzer, & 4. Compute how fast the error changes as the activity of a unit in the previous layer is changed. It  Result Formatter. is the answer in step 2 multiplied by the weight on 9.1 Semantic Web Miner the connection to that output unit. As we have already seen about working of general semantic web mining module. This semantic web miner will include: Syntactic Parser, Semantic Analyzer and Semantic Result Generator. By using steps 2 and 4, we can convert the EAs of This miner performs first level of web results one layer of units into EAs for the previous layer. analysis. This procedure can be repeated to get the EAs for 9.1.1 Syntactic Parser: it is simple parser that as many previous layers as desired. Once we know reads the given input string, and converts to the EA of a unit, we can use steps 2 and 3 to corresponding syntax tree. This syntax tree can be compute the EWs on its incoming connections. used for calculating the syntactic matched results. 8. Features of Artificial Neural Analyzer Large collection of web results will obtain with this Depending on the nature of the application simple syntactic parser. Most of these results may and the strength of the internal data patterns you not useful, so it will be sending for semantic can generally expect a network to train quite well. analysis. This applies to problems where the relationships 9.1.2 Semantic Analyzer: this analyzer will may be quite dynamic or non-linear. ANNs provide verify the semantic details of each and every result an analytical alternative to conventional techniques obtained in the syntactic parser. This analyzer will which are often limited by strict assumptions of verify the semantics by contacting the web server normality, linearity, variable independence etc. Meta information details of the web resources. Some of the major advantages of Artificial Neural Analyzer are: 9.1.3 Semantic Result Generator: this component is useful for finalizing the semantically

302 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, March 2010

matched web result from the list of syntactically matched web results. Even though these results 10. Conclusion may contains some of the un wanted results in The idea proposed in this paper can give miner levels. powerful architecture for the search engines in web 9.2 Artificial Neural Analyzer (ANA): mining. Semantic web miner used in this This will be the second level of web architecture will help in reducing the time required analysis. This analyzer contains various for searching results. Probabilities attached to the components required for producing Artificial semantic results can be used to reduce number or Neural Network effect. Along with that artificial count of un wanted result. With this combined neural network, it also included with Probabilities techniques will give more accurate results of web attacher. This probabilities attacher will add resources with in less time. This paper idea that probability labels for each and every semantic attached to search engines will produce only result that are obtained from the ANA.. relevant information in order to reduce burden of 9.3 Result Formatter: un wanted results for the web user request to same This formatter contains various methods their valuable search time. of representing the output result. Since probabilities 11. References play major role in these analyses, it must sort out [1] B. Berendt. Using site semantics to analyze, the web results according to its probabilities order. visualize and support navigation. Data Mining and Those sorting algorithms must be included in this Knowledge Discovery, 6:37–59, 2002. formatter at the same time it must also include ]2] B. Berendt and M. Spiliopoulou. Analysing probability cut off analysis in order to restrict the navigation behaviour in web sites integrating low probability web results. multiple information systems. The VLDB Journal, 9(1):56–75, 2000. [3] S. Chakrabarti. Data mining for : Atutorial survey. SIGKDD Explorations, 1:1–11, 2000. [4] S. Chakrabarti, B. Dom, D. Gibson, J. Kleinberg, P. Raghavan, and S. Rajagopalan. [5] S. Chakrabarti, M. van den Berg, and B. Dom. Fig 12: Probabilistic Semantic Web Miner with [6] Hans Chalupsky. Ontomorph: Atranslation Artificial Neural Analysis Architecture system for symbolic knowledge. When if a web user requested to search a string [7] G. Chang, M.J. Healey, J.A.M. McHugh, and “semantic web mining”, then syntactic analyzer J.T.L.Wang. Mining the World Wide Web. An will construct syntax tree with three leaf nodes Information Search Approach. labeled ”semantic”, “web”, and “mining”. Then [8] E.H. Chi, P. Pirolli, and J. Pitkow. The scent of search for the matching on the web server. a site: a system for analyzing and predicting information scent, usage, and usability of a web site.

303 http://sites.google.com/site/ijcsis/ ISSN 1947-5500 (IJCSIS) International Journal of Computer Science and Information Security, Vol. 7, No. 3, March 2010

AUTHORS 1. T. KRISHNA KISHORE, working as an Asst. Professor, Dept. of CSE, in ST. ANN’S COLLEGE OF ENGINEERING & TECHNOLOGY,CHIRALA-523187,PRAKASAM (DT.), ANDHRA PRADESH, INDIA. His Interested Areas are COMPUTER NETWORKS, INFORMATION SECURITY, THEORY OF COMPUTATION, COMPILER DESIGN, EMBEDDED SYSTEMS….. Mobile: + 91 – 9885406051 E-mail: [email protected] 2. T. SASI VARDHAN, working as an Asst. Professor, Dept. of IT, in ST. ANN’S ENGINEERING COLLEGE, CHIRALA- 523187,PRAKASAM (DT.), ANDHRA PRADESH, INDIA. His Interested Areas are COMPUTER NETWORKS, INFORMATION SECURITY, DATA MINING & WAREHOUSING, and SOFTWARE ENGINEERING….. Mobile: +91 – 9959859922 E-m ail: [email protected] 3. N. LAKSHMI NARAYANA, working as an Asst. Professor, Dept. of CSE, in ST. ANN’S COLLEGE OF ENGINEERING & TECHNOLOGY,CHIRALA-523187,PRAKASAM (DT.), ANDHRA PRADESH, INDIA. His Interested Areas are COMPUTER NETWORKS, WEB DESIGNING, COMPILER DESIGN, TESTING METHODOLOGIES….. Mobile: +91 – 9030136208 E-mail: [email protected]

304 http://sites.google.com/site/ijcsis/ ISSN 1947-5500