European Journal of Research www.journalofresearch.de № 1/2019

SOCIAL SCIENCE AND HUMANITIES

Manuscript info: Received December 12, 2018; Accepted December 17, 2018; Published January 20, 2019.

DATA MINING: A LIBRARY UTILITY MODEL

Pardeep Rattan, Government College, Phase VI, Sahibzada Ajit Singh Nagar (Mohali), Punjab

http://dx.doi.org/10.26739/2521-3253-2019-1-4

Abstract: Documents, users, services, finances, human resources and space are to a library what a library is to an institution. The library is perhaps the only service agency that, through a robust and helpful library system based on the tools and techniques of information and communication technology (ICT), can satisfy the information needs of a user anywhere on the globe 24x7x365, that is, 8,760 hours a year. Through data mining (DM) techniques, in which data is analysed from different perspectives, a library would be able to strengthen its managerial capabilities, which in turn would give an organisation like a library an edge in serving its clients better. This conceptual paper attempts to identify the core library areas where data mining techniques can be applied to build a stronger, more serviceable library system for the maximum benefit of library users.

Keywords: Data Mining, ICT, Electronic Libraries, Data Knowledge Discovery, Data Warehousing.

Recommended citation: Pardeep Rattan. DATA MINING: A LIBRARY UTILITY MODEL. 1 European Journal of Research P. 39-45 (2019).

Vienna, Austria Generalization of Scientific Results

INTRODUCTION

Information and communication technology (ICT) tools and techniques have influenced and transformed human activity across the globe. They have forced social, economic, educational, health, agricultural, meteorological, scientific and research organisations the world over to think, devise, implement and serve humanity in the most befitting way by adopting them. The indelible impact of ICT on libraries and information centres has made filtered, accurate, tailored and timely information available to users in digital form via […], […] and in-house […]. Information of any kind and in any format is vital to the growth, development and expansion of all parameters of human life. The selection of information sources and their formats in any library is the direct outcome of a clear-cut, well-defined policy, the collection development policy, which is formulated by keeping in mind the aim, mission, vision and target audience of the parent institution. Data mining helps in managing the data from the reports generated by the different sections of the library. This would lay a foundation for decisions on collection development, budgetary provisions, usage of resources, thrust areas, weaknesses, clientele interests, timings and human resources, showing where the library has to continue or modify its set structure to strengthen the current library system. It would also help the library's governing body with a long-term, sustained library development programme.

DATA AND INFORMATION

Data and information are terms used interchangeably in routine practice. However, the thin line between the two is identifiable by the information specialists and managers who process data with IT techniques to convert it into information for the end user in today's electronic age. Data are recorded facts, events, transactions and so on. They act as the input material from which information is generated as needed. In other words, information is analysed data that is useful to the user. To be more specific, Edward and Finlay have put it: "without an efficient means of filtering and aggregating data a manager could be... data rich yet information poor." The characteristics of time, accuracy, relevance, completeness, understanding and verification make information complete. Complete information helps in decision making by virtue of its nature of clarity, uncertainty, and monitoring and control. A Computer Based Library Information System (CBLIS) uses electronic aids, tools and technology to generate relevant information, which can be understood from the following figure:

It has been said that the recipe for a good decision is "90 percent information and 10 percent inspiration." Information acts as a catalyst for the managerial functions of planning, operating and controlling.


Norbert Wiener put it thus: "any organism is held together by the possession of means for the acquisition, use, retention and transmission of information."

DATA MINING (DM) AND LIBRARIES

Data Mining (DM) is a computational process that helps to discover patterns in large data sets through […] (knowledge discovery) and […], […] (data analysis) and […] systems. It is also called Data Knowledge Discovery (DKD), in which data is analysed from different perspectives to derive meaningful and useful information as required, for future planning and to identify the present trends of the parent organisation. In libraries it may be used to find useful but undiscovered patterns in large collections of data. It identifies hidden patterns or distinct characteristics that were previously unknown and that may be helpful in making decisions. The Online Dictionary of Library and Information Science (ODLIS) defines data mining as "the process of using database applications to identify previously undetected patterns and relationships within an existing set of data." More and more organisations today depend on data mining analysis for making targeted business decisions because of the following capabilities of DM:

- The extensive manual work otherwise required for data analysis is largely replaced by automated prediction of trends and behaviours, which helps in targeted marketing, in predicting future trends and in identifying the particular set of clients using a particular service or product.
- The tools employed in DM help to identify hidden patterns through the automated discovery of previously unknown designs, norms or trends. This is best understood through an example from a library scenario, in which a particular group of library users asks for a specific information resource or a subject-specific consultation.
In a report for the University Libraries at Virginia Tech, Young and others (2017) advocated that academic libraries should play a proactive role as facilitators of text and data mining (TDM) to identify and review research trends and to study undetected data relationships. Academic libraries have traditionally been a source of peer-reviewed research and other scholarly literature. Prakash, Prem Chand and Goyal, in a study on the application of DM in library and information science, have made a strong case for DM in libraries: it supports better decision making by connecting the segregated sources of data that arise in many libraries' day-to-day operations. Uppal and Chindwani (2013) suggested that, through DM techniques, related books on the same stream can be suggested to readers for further reading, the arrangement of books and other documents can be altered based on frequent sequences of search and circulation, and readers' interests can be gauged by the association rule of DM. Chen (2013), in a conference paper on DM in digital libraries, put forward a model of DM for digital libraries. He argued that DM helps to optimise automatic information processing, improves information quality and business with reduced costs, and provides a direction for strengthening the digital library collection.

Carnegie Mellon University and Georgetown University, through text and data mining, have recreated the British early modern social network in a project that traces the personal relationships among prominent figures from literature and science such as Bacon, Shakespeare, Isaac Newton and many more. Some of the projects at the University also analyse visual data along with textual data. In her study "Application of Data mining Technology in Digital Library", Zhang (2011) took up a case study of "low utilization readers" at Linyi University Library, in which data on readers with high and low borrowing rates was mined and analysed according to the nature of their behaviour to establish the factors behind low borrowing. The results could help the library in decision making, in developing appropriate strategies to attract readers and in predicting reader use. The author was of the view that DM technology would usher libraries into a new realm of faster development of librarianship and create good social benefits.
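The suggestion that association rules can be used to recommend related reading and to gauge readers' interests can be illustrated with a minimal Python sketch. The circulation records, the subject labels and the thresholds `min_support` and `min_confidence` are all hypothetical and chosen only for illustration; the sketch simply counts pairs of subjects borrowed together and is not the method used in any of the studies cited here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical circulation records: each set holds the subjects
# that one reader borrowed together (illustrative data only).
transactions = [
    {"statistics", "data mining", "databases"},
    {"statistics", "data mining"},
    {"data mining", "databases"},
    {"statistics", "poetry"},
    {"data mining", "statistics", "machine learning"},
]


def association_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence) for subject pairs."""
    n = len(transactions)
    # How many transactions contain each single subject.
    item_counts = Counter(item for t in transactions for item in t)
    # How many transactions contain each pair of subjects.
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(t), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of all transactions containing both subjects
        if support < min_support:
            continue
        # Test the rule in both directions: a -> b and b -> a.
        for antecedent, consequent in ((a, b), (b, a)):
            confidence = count / item_counts[antecedent]
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, support, confidence))
    # Strongest rules first; ties broken alphabetically for a stable order.
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))


rules = association_rules(transactions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: "
          f"support={support:.2f}, confidence={confidence:.2f}")
```

With this toy data the strongest rule is that readers who borrow on databases also borrow on data mining, the kind of pattern a library could use to suggest further reading.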

GOALS OF DATA MINING

Prediction: Behavioural patterns of future within data
Identification: Current norms, activities, items, focus
Classification: Categorisation of data into different sets based on different parameters
Optimisation: Extent of finances, time, space, materials etc. for optimisation of output within limitations of an organisation

The following graphic representation helps to understand the process of data mining:


Data is gathered by observation, communication, reading, and the daily records or reports of an organisation or a business. For a library, the total number of documents in the collection, the types and formats of documents, the size of the organisation in terms of users, timings, transactions, reference transactions, the research interests of users, abstracting services, the SDI interests of users and all other records act as data. DM techniques are normally applied to decision making in business and corporate environments involving massive data, such as marketing, finance, manufacturing, health, insurance, banking and so on.

DATA MINING AND KNOWLEDGE DISCOVERIES IN LIBRARIES

The new rules, norms, patterns, decisions, focus areas and networks that result from DM techniques give birth to new knowledge that was, to a certain extent, previously hidden. The following new types of information may be discovered from and for the different sections of the library:

1. Association Rules: The database in this case is regarded as a collection of transactions, each involving a set of items. For example, a reader who comes to the library to borrow a particular document may also consult the reference section for the same information in a related or associated document, which may be an encyclopaedia, journal, conference proceeding, report or any other such document.

2. Classification Trees: This is the process of learning a model that describes different classes of data. The existing set of events is classified into different hierarchies among the same classes. For example, the student population may be divided by the circulation section into several classes or ranges, each representing a particular department or course of study, based on the students' previous transactions in the circulation section. A particular student may also be classified by the frequency of library visits, by the types of documents and services used, and so on.

3. Sequential Patterns: Sequential patterns describe sequences of actions or events, such as the documents issued or consulted by library users during the first semester of their course of study. They lead to predictions about a user or set of users for the second semester, based on how the user approached the library in the first semester.

4. Patterns within Time Series: This rule helps to gather data within a time series, such as the behavioural pattern or usage of the library by a user on a daily, weekly or monthly basis. It can be helpful for libraries working in a consortium, on inter-library loan or in a network, as it depicts the pattern of transactions for a particular document or service within a timeframe.

5. Clustering: From an identified population of items or events, segments of similar items can be created. The segmented sets are different clusters having similar values or records. For example, the user population may be categorised into groups such as "least likely to visit", and documents into groups from "least preferred" to "most preferred" for different subjects and students. The range of students or users may be further segmented by social and economic background, such as rural or urban, income group and so on.

DATA MINING AND LIBRARY UTILITY MODEL

The preceding discussion shows that data mining rules can help to establish a stronger library system through the mechanism of data analysis from different perspectives. The application of DM techniques influences policy decisions across all sections of the library for holistic library growth:

1. The document selection or acquisition policy can be amended in line with the results of data analysis, giving emphasis to certain subject areas and streams and curtailing others, along with the budgetary provisions.
2. Circulation thrust areas are identified.
3. Analysis of data on the research interests of users helps to build a strong collection, in different formats, for the reference and research sections of the library.
4. The financial provisions, human resources and opening hours of the library may be arranged according to the new patterns.
5. An aggressive marketing policy for library and information services and products can be undertaken for optimal use of library resources.
6. The weaknesses and strengths of the library can be identified for future planning and growth.
7. The behaviour of library personnel towards their users can be shaped by intensive feedback from readers.
8. The results of usability and demand surveys help in introducing new techniques, technologies and innovations.
9. Improved library consortia and networking become possible, catering to the information needs of the general reader, the specific-content reader and so on.
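The idea of patterns within time series, and its use in arranging library hours according to observed usage, can be sketched in Python as follows. The transaction log, its timestamp format and the function names are assumptions made for this illustration only; a real library system would export its own records in its own format.

```python
from collections import Counter
from datetime import datetime

# Hypothetical circulation log: one ISO timestamp per transaction.
log = [
    "2018-11-05T09:15", "2018-11-05T10:40", "2018-11-05T10:55",
    "2018-11-05T16:05", "2018-11-06T10:10", "2018-11-06T16:30",
    "2018-11-06T16:45", "2018-11-07T10:20",
]


def transactions_per_hour(timestamps):
    """Count transactions for each hour of the day, across all dates."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


def busiest_hours(timestamps, top=2):
    """Return the `top` hours with the most transactions, busiest first."""
    counts = transactions_per_hour(timestamps)
    # Sort by descending count, then by hour so that ties are ordered stably.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [hour for hour, _ in ranked[:top]]


print(busiest_hours(log))
```

In this toy log the counter desk is busiest in the hours starting at 10:00 and 16:00, the kind of pattern that could inform staffing and opening hours.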
CONCLUSION

Data mining techniques, which combine statistical techniques, […] and machine learning, support data analysis from and for the different angles of an organisation, which ultimately gives birth to data knowledge discovery. Libraries, which are perennial providers of information in different ways, may amend the way they deliver information to their end users by predicting and identifying the behaviour and needs of those users, with data classification, or the categorisation of data facets, forming a solid base for the overall growth of any library.


References

1. Bin Chen (2013). Data mining in digital libraries. In Yang Y & Man, Lin (eds.), Information and Applications. ICICA 2013. Communications in Computer and Information Science, vol. 392, Springer, Berlin, Heidelberg. 282-291. Retrieved from https://doi.org/10.1007/978-3-642-53703-5_30
2. Data Mining [Def.]. (n.d.) ODLIS. In Online dictionary of library and information science. Retrieved December 17, 2017 from www.abc-clio.com/ODLIS/odlis-A.aspx
3. https://www.library.cmu.edu/research/tdm/overview
4. Lucy, Terence. (2005). Management information systems. Retrieved from https://books.google.co.in/books?isbn=1844801268
5. Murdick, Robert G, Ross, Joel E & Claggett, James R. (2002). Information systems for modern management, 3rd ed., New Delhi: PHI.
6. Prakash K, Prem Chand & Goyal Umesh. "Application of Data Mining in Library and Information Science". Retrieved December 17, 2017 from ir.inflibnet.ac.in:8080/ir/bitstream/1944/435/1/04Planner-22.pdf
7. Uppal, Veepu & Chindwani Gunjan. (2013). An Empirical Study of Application of Data Mining Techniques in Library System. International Journal of Computer Applications, 74(11), 42-46. research.ijcaonline.org/Vol. 74/ No.11
8. Wiener, Norbert. (1948). Cybernetics. NY: John Wiley.
9. Young, P & Others (2017). Library Support for Text and Data Mining: A Report for the University Libraries at Virginia Tech. Retrieved from https://vtechworks.lib.vt.edu/bitstream/handle/10919/78466/TDMreport.pdf/sequence=5
10. Zhang, Mei (2011). Application of Data mining technology in digital library. Journal of Computers, 6(4), 761-768. https://pdfs.semanticscholar.org
