INTERNATIONAL JOURNAL OF INFORMATION AND COMPUTING SCIENCE ISSN NO: 0972-1347

Survey on Recent Tools, Platforms and Interface

1.Mr.S.Karthi, M.E,AP/IT, St.Joseph College of Engineering, Chennai, [email protected] 2.Mrs.R.Raja Ramya, M.E.(Ph.D),AP/IT, Saveetha Engineering College, Chennai, ramyanitha @ gmail.com 3.Mrs.N.Kanagavalli, M.E.(Ph.D),AP/IT, Saveetha Engineering College, Chennai, [email protected]

ABSTRACT- Machine learning and deep in the process of working through a machine learning are recent booming research areas. Tools are learning problem a big part of machine learning and choosing the right tool can be as important as working with the Why Use Tools? best algorithms. So that the new researchers are not aware of the tools and frameworks are helped to Machine learning tools make applied machine machine learning. There are lots of open source tools learning faster, easier and more fun. and frameworks are available in machine learning and deep learning. In this paper, various open source Faster: This means that the time from ideas to machine learning tools and frameworks are explained. results is greatly shortened. The alternative is that you have to implement each capability Keywords-open source tools, frameworks, platforms yourself.

Introduction Easier: The open source tools are very easier to learn by individual. Based on the input, we A programming tool or software development tool is form the output in an easiest way. a computer program that software developers use to

create, debug, maintain, or otherwise support other This requires research, deeper exercise in order programs and applications. to understand the techniques, and a higher level The term usually refers to relatively simple programs, of engineering to ensure it is implemented that can be combined together to accomplish a task, efficiently. much as one might use multiple hand tools to fix a physical object. Fun: Learning open source tools are like playing a game. The tools are really helped to The most basic tools are a source code editor and beginners to get good results. You can use the a compiler or interpreter, which are used ubiquitously extra time to get better results or work on more and continuously. Other tools are used more or less projects. depending on the language, development methodology, and individual engineer, and are often Good versus Great Tools used for a discrete task, like a debugger or profiler. You want to use the best tools for the problems that Machine learning engineers are part of the you are working on. How to do tell the difference engineering team who build the product and the between good and great machine learning tools? algorithms, making sure that it works reliably, quickly, and at-scale. They work closely with Intuitive Interface: Great machine learning data scientist to understand the theoretical and tools provide an intuitive interface onto the sub- business aspect of it. tasks of the applied machine learning process. There’s a good mapping and suitability in the Machine learning tools are not just interface for the task. implementations of machine learning algorithms. They can be, but they can also provide capabilities that you can use at any step

Volume 6, Issue 6, June 2019 673 http://ijics.com INTERNATIONAL JOURNAL OF INFORMATION AND COMPUTING SCIENCE ISSN NO: 0972-1347

Best Practice: Great machine learning tools A platform provides all you need to run a embody best practices for process, project, whereas a library only provides discrete configuration and implementation capabilities or parts of what you need to complete a project. Examples include automatic configuration of machine learning algorithms and good process This is not a perfect distinction because some built into the structure of the tool. machine learning platforms are also libraries or some libraries provide a graphical user Trusted Resource: Great machine learning interface. tools are well maintained, updated frequently and have a community of people around it. List of frameworks for machine When to Use Machine Learning Tools learning Machine learning tools can save our time and help you consistency deliver good results across 1. Apache Singa is a general distributed deep projects. Some examples of when you may get learning platform for training big deep the most benefit from using machine learning learning models over large datasets. It is tools include: designed with an intuitive programming model based on the layer abstraction. Getting Starting: A variety of popular deep learning models are When you are just getting started machine supported, namely feed-forward models learning tools guide you through the process of including convolutional neural networks delivering good results quickly and give you (CNN), energy models like restricted confidence to continue on with your next Boltzmann machine (RBM), and recurrent project. neural networks (RNN). Many built-in layers are provided for users. Day-to-Day: When you need to get good results to a question quickly machine learning 2. Amazon Machine Learning is a service tools can allow you to focus on the specifics of that makes it easy for developers of all skill your problem rather than on the depths of the levels to use machine learning technology. techniques you need to use to get an answer. Amazon Machine Learning provides visualization tools and wizards that guide Project Work: When you are working on a you through the process of creating machine large project, machine learning tools can help learning (ML) models without having to you to prototype a solution, figure out the learn complex ML algorithms and requirements and give you a template for the technology. system that you may want to implement. It connects to data stored in Amazon S3, Platforms versus Libraries Redshift, or RDS, and can run binary There are a lot of machine learning tools. classification, multiclass categorization, or Enough that a google search can leave you regression on said data to create a model. feeling overwhelmed. One useful way to think about machine learning tools it so separate them 3. Azure ML Studio allows Microsoft Azure into Platforms and Libraries. users to create and train models, then turn them into APIs that can be consumed by other services.

Volume 6, Issue 6, June 2019 674 http://ijics.com INTERNATIONAL JOURNAL OF INFORMATION AND COMPUTING SCIENCE ISSN NO: 0972-1347

extend the platform seamlessly into your Users get up to 10GB of storage per account Hadoop environments. for model data, although you can also connect your own Azure storage to the 7. Massive Online Analysis (MOA) is the service for larger models. A wide range of most popular open source framework for algorithms are available, courtesy of both data stream mining, with a very active Microsoft and third parties. growing community.

You don’t even need an account to try out It includes a collection of machine learning the service; you can log in anonymously algorithms (classification, regression, and use Azure ML Studio for up to eight clustering, outlier detection, concept drift hours. detection and recommender systems) and tools for evaluation. Related to the WEKA 4. Caffe is a deep learning framework made project, MOA is also written in Java, while with expression, speed, and modularity in scaling to more demanding problems. mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and by 8. MLlib (Spark) is Apache Spark’s machine community contributors. learning library. Its goal is to make practical machine learning scalable and easy. 5. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released It consists of common learning algorithms under the BSD 2-Clause license. Models and utilities, including classification, and optimization are defined by regression, clustering, collaborative configuration without hard-coding & user filtering, dimensionality reduction, as well can switch between CPU and as lower-level optimization primitives and GPU. Speed makes Caffe perfect for higher-level pipeline APIs. research experiments and industry deployment. Caffe can process over 60M 9. mlpack, a C++-based machine learning images per day with a single NVIDIA K40 library originally rolled out in 2011 and GPU. designed for “scalability, speed, and ease- of-use,” according to the library’s creators.

6. H2O makes it possible for anyone to easily Implementing mlpack can be done through apply math and predictive analytics to solve a cache of command-line executable for today’s most challenging business quick-and-dirty, “black box” operations, or problems. with a C++ API for more sophisticated work.

It intelligently combines unique features not mlpack provides these algorithms as simple currently found in other machine learning command-line programs and C++ classes platforms including: Best of Breed Open which can then be integrated into larger- Source Technology, Easy-to-use WebUI scale machine learning solutions. and Familiar Interfaces, Data Agnostic Support for all Common Database and File 10. Pattern is a web mining module for the Types. Python programming language.

With H2O, you can work with your existing It has tools for (Google, Twitter languages and tools. Further, you can and Wikipedia API, a web crawler, a

Volume 6, Issue 6, June 2019 675 http://ijics.com INTERNATIONAL JOURNAL OF INFORMATION AND COMPUTING SCIENCE ISSN NO: 0972-1347

HTML DOM parser), natural language Graphs can be assembled with C++ or Python processing (part-of-speech taggers, n-gram and can be processed on CPUs or GPUs. search, sentiment analysis, WordNet), machine learning (vector space model, 14. Theano is a Python library that lets you to clustering, SVM), network analysis and define, optimize, and evaluate mathematical visualization. expressions, especially ones with multi- dimensional arrays (numpy.ndarray). 11. Scikit-Learn leverages Python’s breadth by building on top of several existing Python Using Theano it is possible to attain speeds packages — NumPy, SciPy, and matplotlib rivaling hand-crafted C implementations for — for math and science work. problems involving large amounts of data.

The resulting libraries can be used either for It was written at the LISA lab to support rapid interactive “workbench” applications or be development of efficient machine learning embedded into other software and reused. algorithms. Theano is named after the Greek mathematician, who may have been The kit is available under a BSD license, so Pythagoras’ wife. Theano is released under a it’s fully open and reusable. Scikit- BSD license. learn include tools for many of the standard machine-learning tasks (such as 15. Torch is a scientific computing framework clustering, classification, regression, etc.). with wide support for machine learning algorithms that puts GPUs first. 12. Shogun is among the oldest, most venerable of machine learning libraries, Shogun was It is easy to use and efficient, thanks to an created in 1999 and written in C++, but isn’t easy and fast scripting language, LuaJIT, and limited to working in C++. an underlying C/CUDA implementation.

Thanks to the SWIG library, Shogun can be The goal of Torch is to have maximum used transparently in such languages and flexibility and speed in building your environments: as Java, Python, C#, Ruby, , scientific algorithms while making the process Lua, Octave, and Matlab. extremely simple.

Shogun is designed for unified large-scale Torch comes with a large ecosystem of learning for a broad range of feature types and community-driven packages in machine learning settings, like classification, learning, computer vision, signal processing, regression, or explorative data analysis. parallel processing, image, video, audio and networking among others, and builds on top 13. TensorFlow is an open source software of the Lua community. library for numerical computation using data flow graphs. 16. Veles is a distributed platform for deep- learning applications, and it’s written in C++, TensorFlow implements what are called data although it uses Python to perform automation flow graphs, where batches of data (“tensors”) and coordination between nodes. can be processed by a series of algorithms described by a graph? Datasets can be analyzed and automatically normalized before being fed to the cluster, and The movements of the data through the a REST API allows the trained model to be system are called “flows” — hence, the name. used in production immediately. It focuses on

Volume 6, Issue 6, June 2019 676 http://ijics.com INTERNATIONAL JOURNAL OF INFORMATION AND COMPUTING SCIENCE ISSN NO: 0972-1347

performance and flexibility. It has little hard-  They provide a specific capability for one or coded entities and enables training of all the more steps in a machine learning project. widely recognized topologies, such as fully connected nets, convolutional nets, recurrent  The interface is typically an application nets etc. programming interface requiring programming. Machine Learning Platform  They are tailored for a specific use case, A machine learning platform provides problem type or environment. capabilities to complete a machine learning

project from beginning to end. Namely some data analysis, data preparation, modeling and Examples of machine learning libraries are: algorithm evaluation and selection. scikit-learn in Python.

Features of machine learning platforms are: JSAT in Java.

 To provide capabilities required at each step Accord Framework in .NET in a machine learning project.

 The interface may be graphical, command Machine Learning Tool Interfaces line, programming all of these or some Another useful way to think about machine combination. learning tools is by the interface they provide.

 To provide a lose coupling of features, This can be confusing, because some tools requiring that you tie the pieces together for provide multiple interfaces. your specific project. Nevertheless, it provides a starting point and  They are tailored for general purpose use perhaps a point of differentiation to help you and exploration rather than speed, pick and choose a machine learning tool. scalability or accuracy. Below are some examples of common  Examples of machine learning platforms interfaces. are: Graphical User Interface WEKA Machine Learning Workbench. R Platform. Machine learning tools provide a graphical user Subset of the Python SciPy interface including windows, point and click (e.g. Pandas and scikit-learn). and a focus on visualization. The benefits of a

graphical user interface are: Machine Learning Library A machine learning library provides capabilities  Allows less-technical users to work through for completing part of a machine learning machine learning. project. For example a library may provide a collection of modeling algorithms.  Focus on process and how to get the most from machine learning techniques. Features of machine learning libraries are:

Volume 6, Issue 6, June 2019 677 http://ijics.com INTERNATIONAL JOURNAL OF INFORMATION AND COMPUTING SCIENCE ISSN NO: 0972-1347

 Structured process imposed on the user by Application Programming Interface the interface. Machine learning tools can provide an  Stronger focus on graphical presentations of application programming interface giving you information such as visualization. the flexibility to decide what elements to use Some examples of machine learning tools with and exactly how to use them within your own a graphical interface include: programs. The benefits of application KNIME programming interface are: RapidMiner Orange  Can incorporate machine learning into your own software projects. Command Line Interface  Can create your own machine learning Machine learning tools provide a command line tools. interface including command line programs,  Gives you the flexibility to use your own command line parameterization and a focus on processes and automations on machine input and output. The benefits of command line learning projects. user interface are:  Allows combining your own methods with  Allows technical users that are not those provided by the library as well as programmers to work through machine extending provided methods. learning projects. Some examples of machine learning tools with application programming interfaces include:  Provides many small focused programs or program modes for specific sub-tasks of a Pylearn2 for Python machine learning project. Deeplearning4j for Java LIBSVM for C  Frames machine learning tasks in terms of the input required and output to be generated. Local Tools A local tool is downloaded, installed and run on  Promotes reproducible results by recording your local environment. or scripting commands and command line arguments.  Tailored for in-memory data and algorithms. Some examples of machine learning tools for a  Control over run configuration and command line interface include: parameterization.

Waffles  Integrate into your own systems to meet WEKA Machine Learning Workbench your needs

Volume 6, Issue 6, June 2019 678 http://ijics.com INTERNATIONAL JOURNAL OF INFORMATION AND COMPUTING SCIENCE ISSN NO: 0972-1347

Examples of local tools include: References

Shogun Library for C++ 1.https://machinelearningmastery.com/machine GoLearn for Go -learning-tools

Remote Tools 2.https://www.kdnuggets.com/2016/04/top-15- A remote tool is hosted on a server and called frameworks-machine-learning-experts.html from your local environment. These tools are often referred to as Machine Learning as a 3. https://www.infoworld.com/.../machine- Service (MLaaS). learning/11-open-source-tools-machine- learning.  Tailored for scale to be run on larger datasets.  Run across multiple systems, multiple cores and shared memory.  Fewer algorithms because of the modifications required for running at scale.  Simpler interfaces providing less control over run configuration and algorithm parameterization.  Integrated into your local environment via remote procedure calls.

Examples of remote tools:

Google Prediction API AWS Machine Learning Microsoft Azure Machine Learning

Conclusion

 This paper concludes that some open source tools and frameworks are used in the research areas. The tools are helped to produce the real time output. Some tools are equal to our gaming software.

 Some tools are similar to simulation tools the results are carried out in the form of graphs.

Volume 6, Issue 6, June 2019 679 http://ijics.com