<<

Python: The Most Advanced Programming for Science Applications

Akshit . Dhruv, Reema Patel and Nishant Doshi and Engineering, Pandit Deendayal Petroleum University, Gandhinagar, Gujarat

Keywords: Python, Python libraries, Memory allocation, , Framework.

Abstract: In last few years, there has been an advancement in programming due to different libraries that are introduced. All the developers in this modern era prefer that provides a built-in module/ can their work easy. This paper describes the advancement of one such language “Python” and it’ increasing popularity through different statistical data and graphs. In this paper, we explore all the built-in libraries for all different computer science domains such as Data Science, Learning, Image Processing, Deep Learning, Natural Language Processing, Data , Cloud Computing, Speech recognition, etc. We have also included Memory management in Python. Different frameworks for Python which can make the front-end work easier are also mentioned.

1 INTRODUCTION 2

In 1991, Python language was developed by Guido Data structure means organization, management of van Rossum. There is an interesting story behind data and also it is a storage format which provides giving the name “Python” to the programming efficient access and modification. In general, it language. At the time of development of python, the contains relation among them, and the functions or developer was reading the script “Monty’s Python operations that can be applied to the data. Flying which is a BBC series. While reading this • : It helps in improving the speed of the book he got an idea to name the programming of the code. language as “Python” to have a short and unique • PYTables: It is used in maintaining name. Python is object- oriented, interpreted, and hierarchical datasets and is also used to interactive programming language. It provides high- maintain an extremely large amount of data. level data such as list, tuples, sets, • Tree Dict: It works as a container for python associative arrays (called dictionaries), dynamic to simplify the bookkeeping surrounding typing and binding, modules, classes, exceptions, parameters, variables, and data. It is very automatic memory management, etc. It is also used stable and fast at work. for system and has a comparatively simple and easy for coding Table 1: Different Data Structure in Python. and still it is a powerful programming language. Type Definition Example Python has the for known as

JPython, which is similar to the interpreter for It is defined A list is a by square mutable data List=[1,2,3] language. Python has many advantages over any List braces [ ]. other languages, like it has varieties of library which structure, ordered sequence of reduces the code to one-third for and element. due to this Python has reached at the +highest peak in terms of . Difficulty is faced by many while solving problems(Lawan et al, 2015), this research will help providing knowledge about different libraries and motivate them to use Python.

292

Dhruv, A., Patel, . and Doshi, N. Python: The Most Advanced Programming Language for Computer Science Applications. DOI: 10.5220/0010307902920299 In Proceedings of the International Conference on Culture Heritage, Education, Sustainable Tourism, and Innovation (CESIT 2020), pages 292-299 ISBN: 978-989-758-501-2 Copyright c 2021 by SCITEPRESS – Science and Publications, Lda. All rights reserved

Python: The Most Advanced Programming Language for Computer Science Applications

Dictionary are the capability of adding rich media, also called observations, shell syntax, backup of HashMap or history, and tab completion. associative arrays, It is defined by Dict= Dictionary which means that {} (https://pypi.org) It is also used in {1:”a”,2:”” an element of the by using IPython as fix interpreter. The usage } list is associated of Mathematica or MATLAB makes it with the comfortable to work with IPython. It is also definition, rather like Map in Java. used in Data Structure. • : Video games are created easily using It is a collection of It is defined Pygame. The library has computer graphics Set={1,2,3 Set unordered and by brackets } and sound libraries which are specially made unique immutable {} for python programming language. objects. • SQLAlchemy: It provides a common interface for creating and executing -agnostic code without the need of writing SQL 3 BUILT-IN LIBRARIES IN statements. It is also used in data structure. PYTHON FOR COMPUTER • Scrapy: This library is used to design scraping, and also it can be used to get data SCIENCE APPLICATIONS using or it is used as a general-purpose web crawler. 3.1 Data Science • Pywin32: This library is used to create COM objects and the Pythonwin environment. Data Science is to develop a different approach to • wxPython: GUI toolkit for the Python record, store, and analyse the data and using this programming language can be obtained by data to get effective information. Data science aims this library. Applications made using this has at achieving ideas and knowledge from any type of native appearance on all platforms. data. • : It allows you to build and web Python provides number of libraries for the same apps very fast and efficiently. as listed below: • Nose: It runs tests or directories whose name • : 2D plot graphs can be made using includes “test” at the end of the word. To ease Matplotlib library. out the print- style debugging, it includes • : in finance, statistics, captured stdout output from failing tests. social science, and engineering require • Sympy: It is used for symbolic . It different types of data structure and tools tries to keep the code as simple as possible in which are provided by Pandas. process of making a full-featured computer (https://pypi.org). algebra system (CAS). • : It is the library for scientific • Fabric: Fabric along which is acting as library computing in Python. (https://pypi.org) for Python, is also a command line interface Multidimensional arrays and matrices can be tool for increasing the use of SSH for the done using objects in NumPy, and also application arrangement or systems routines are provided which allows developers administrations. The main use of this library is to compute advanced mathematical and to create a module which contains one or more statistical functions on those arrays with code functions, and then executing them through if possible. It is also used in Data Structure. fab command-line tool. • SciPy: Manipulation and visualization of data • Pillow: Python Imaging Library which adds is done using a high-level command provided the support for different options like opening, in SciPy. Functions for solving Integrals manipulating data, and saving images as numerically, computing differential equations, different file formats. It is also used in Image and optimization are included in the package. processing. The library SciPy is also used in Image • Statsmodels: Statistical Models can be processing. estimated using this library. Also it can • IPython: Using Ipython, an efficient explore data and perform statistical test. It is interactive shell gets added along with the also used in machine-learning. functionality of Python’s interpreter that has

293

CESIT 2020 - International Conference on Culture Heritage, Education, Sustainable Tourism, and Innovation Technologies

3.2 Machine Learning 3.3 Deep Learning

Machine learning can also be considered as a subset Deep Learning can also be called part of Machine or part of that can learn learning. It has a layer of Artificial Neural Network automatically and make changes itself from the which can learn the unstructured or unlabelled data. experience without being externally programming it.( (Machine Learning and Deep Learning frameworks Machine Learning and Deep Learning frameworks and libraries for large-scale ). and libraries for large-scale data mining). • Apache MxNet: It permits to use • Keras: It is a neural networking API and it is to symbolic and crucial coding to increase execute for the machine learning beginners to productivity and efficiency Inside MxNet build and design neural networks. It is also there is a modern dependency scheduler that used in deep learning. will help to automatically parallelizes both • Shogun: For a wide range of efficient and symbolic and imperative operations quickly. unified machine learning methods, Shogun • Caffe: Expression, modularity, and speed are library is used which is an open source the key features of this library. library.[5] • Fastai: It simplifies the training of neural nets • XGBoost: XGBoost is decision tree that uses very quickly and with accuracy and using the the to solve the predictive modelling latest technique. It includes the support for problems, and this algorithm is efficient and text, vision, and tabular models. (Machine speedy. Learning and Deep Learning frameworks and • Scikit-learn: It is used for classical ML libraries for large-scale data mining). . It supports direct and indirect • CNTK: Neural networks are defined in a learning algorithms and also be used for data directed graph by a series of computational mining and data analysis. It has many steps that are described by the toolkit of applications but majorly unit-testing and self- CNTK. In this graph, input values are verification are done using Scikit-learn to represented by leaf nodes or network detect and diagnose different types of error. parameters and other represent • CatBoost: It will read the documents and operation which took place on the inputs. analyse the model and data with CatBoost • TFLearn: High level API is provided to analysing tools. TensorFlow using this library. This library • PyTorch: PyTorch allows the user to work on helps to get the quick outputs of the Tensors and GPU accelerations by experimentation and also the process is implementing it in C with a wrapper in Lua. It transparent. also can create computational graphs. • Lasagne: It is used to build and train neural • Eli5: It is a python library which is used to networks using Theano. Convolutional Neural debug classifiers in machine learning and Networks (CNNs), recurrent networks are explain the predictions. supported by this library. • : It is the GNU based scientific library • Elephas: Deep learning’s distributed models and make the extensive use of Cython can run at language. scale with the use of spark using this library. • Nilearn: Statistical learning on Neurolmaging • Theano: Mathematical expression for multi- data can easily be learned and understood dimensional arrays can be optimized, evaluate using Nilearn library. and can also be defined in this library. Theano • TensorFlow: It is used for high-performance is also used in Machine Learning. numerical computation. It can develop many artificial Intelligence applications by 3.4 Image Processing implementing the deep neural networks. It is also used in Deep Learning.(Machine Learning Image processing is specially used to do some and Deep Learning frameworks and libraries operations on an image to get a better-quality image for large-scale data mining) or to find some useful information from it. It works According to the GitHub survey from January like signal processing in which we take input as 2018 to December 2018, Python is at the top in image and output may vary, like it can be image or terms of machine learning compared to C++, characteristic features which is associated with that Java, JavaScript, etc. (https://github.com). image.

294

Python: The Most Advanced Programming Language for Computer Science Applications

• Scikit Image: It is a collection of algorithms 3.6 Networking for image processing and uses NumPy arrays as image objects. It includes algorithms for Python provides two-level access to networking. geometric transformation, segmentation, One low level, in which one can access the basic colour space, analysis, manipulation, filtering, socket support in the same OS that permits morphology, feature detection, etc. implementation for clients and servers to do • Open cv-python: It is developed by Intel for connection-orientation and connectionless protocols. real-time image & video analysis and • asyncio: It provides base for writing the processing. existing code in a single sequence using • Mahotas: Functions such as morphological coroutines, multiplexing I/O access over operations, modern computer vision functions sockets and different resources, which are and filtering for the advanced computation running network clients, servers, and other and includes the interest point detection also. related primitives. To detect common issues • Cairo: It acts as a 2D graphics library for during development, debug mode is enabled. python and also supports many output devices. • Tftpy: It includes client and server classes and Using display hardware acceleration, it gives create a TFTP server/client to receive/send continuous output on all connected devices. files. • lib: It provides a telnet client 3.5 Game Development implementation, so it represents a connection to a Telnet Server. Game Development is used to create games and • Parmiko: It is an implementation of SSHv2 describes the design, development, and release of a protocol, providing the functionality of client game. Before game development, it is important to and server both. SFTP client and server mode think about the game mechanics, rewards, player are supported. engagement, and level designing. • Requests: All kinds of HTTP requests are sent • Pyglet: It is a cross-platform windowing and in python using this module. multimedia library for Python, developed to create games and other visually rich 3.7 Natural Language Processing applications. It has feature which can load images, sound, music, and video in almost any Natural Language Processing shows the connection format. between human language and . It is used • Arcade: It is used to create 2D video games. It in businesses and it is a very important term in every helps the developer to create the 2D games engineer’s life. without learning complex frameworks. • : It is used for topic modeling, • Rabbyt: It is a sprite library for python which similarity revival and document indexing with provides fast performance with an easy to use large corpora. It includes features like all the but flexible API. algorithms are memory- independent • Pymunk: It is a pythonic 2D physics library compared to the intuitive interfaces, corpus that can be used whenever there is a need for size, efficient multicore of 2D rigid body physics from Python. popular algorithms, distributed computing etc. • Pybox 2D: It is purely 2d engine written • Textblog: This library is used for processing primarily for games. It includes features like written data. It provides API for part-of- circles, up to 16 sided polygons, thin line speech tagging, sentiment analysis, noun segments, controllers, basic breakable bodies, phrase extraction, classification, translation, and pickling support. WordNet integration, word inflection, , • Panda 3D: It is used for 3D rendering and add new models or languages through game development. It also support automatic extensions. shader generation, which means that we can • SpaCy: It is a Natural language Processing use normal maps HDR, gloss maps, cartoon library of Python which contains pre-trained shading glow maps. statistical models, word vector and also it has support tokenization for 49+ languages. • Vocabulary: It is a Python library, which is used to get the meaning, synonyms, opposites, part of speech, translations for a given word.

295

CESIT 2020 - International Conference on Culture Heritage, Education, Sustainable Tourism, and Innovation Technologies

• PyNLPl: PyNLPI is also read as “Pineapple”, types of data such as Python lists, tuples, which is used in extracting a sequence of n NumPy arrays or Pandas Data Frames to make items in a list and also used in building a the plots. simple language model. This library covers a • Py.gal: Formats for the vector graphic are large area in the field of working with FoLiA SVG whose charts can be made from python XML. using a library called PyGal. • NLTK: Lexical resources such as WordNet • Ggplot: It is a very helpful library for the requires an interface to run in Python which is who are coming from the R provided by NLTK and also provides library background and used ggplot2 in it, as the same called text processing for tokenization, Ggplot is used in python. classification and stemming. It has built- in • Plotly: It is an interactive library use for which provides practical ideas to plotting which supports 40 different types of programming for language processing. graphs covering a wide range of financial, • CoreNLP: Using this library it is easy to statistical, geographic, scientific, and 3 groups of linguistic analysis tools to a piece of dimensional. text. It includes many tools such as: Part-of- • Missingno: Many times the datasets are speech recognizer (POS) tagger, the missing and they are represented by NaN (not conference resolution system, the parser, the a number). So this library provides a way to named entity recognizer, sentiment analysis, visualize the distribution of NaN values. It is the open information extraction tools, and compatible with Pandas. bootstrapped pattern learning. • Leather: It makes the work done fast but not • Pattern: It has tools for data mining, natural with perfection, like someone needs charts but language processing, machine learning, doesn’ expect any type of accuracy or network analysis using graphs centrality and perfection can use this library. visualization. • Pydot: It provides a complete interface to create, handle modify and process graphs. 3.8 Cloud Computing 3.10 Speech Recognition. Cloud Computing uses a network of remote server which is hosted on the to store, process, and Speech recognition is used to convert the human manage the data, rather than a local server or a voice to computer understandable language using personal computer. different software or different hardware. It has many • Apache Libcloud: It is a python library for applications, like to give command to computer do cloud computing and it does work of hiding perform any particular task without even writing or difference between different cloud provider working physically. APIs and allows us to manage different cloud • CMU sphinx: It contains the best toolkit with resources through a unified and easy to use many different tools used to build speech API. applications. • google--python client: It allows us to work • Google Speech Recognition: Provides facility with Google APIs such as Google+, YouTube likes or Drive on our server. They are officially o Configure Microphone. supported by Google. o Set chunk size. o Set the sampling rate. 3.9 o Set device id to selected microphone. o Allow adjusting for unknown noises. Data visualization is used to represent the • Bing Voice Recognition: information in the of a chart, diagram, and o Microsoft Heera pictures. o Microsoft Zira • Seaborn: It is a matplotlib based library that o Microsoft David provides an interface of high-level for making o Microsoft Mark catchy and informative statistical graphics. o Microsoft Ravi • Bokeh: It is used as an interactive are some of the in-built voice used in speech visualization library that can target web recognition app browsers for representations. We can pass all

296

Python: The Most Advanced Programming Language for Computer Science Applications

• Houndify API: It is used in creating a client Table 2: Difference between Python and C. that converts speech to text in under a few Python C Language minutes. It provides a simple way for No data is stored during Data is stored in “Stack” developers to use the platform for its speech to . during compile time. text capabilities through the speech to text During run time the During run time the data only domain. data is • Api.py: It is used to combine speech is stored in “heap”. stored in “heap”. recognition with natural language processing. There is no use of Variable is also stored in • Ppytsx3: It is a text to speech python library variable in storage. memory. which works without internet connections or All the values are stored All the values are stored any type of delay. as making the using memory location as • PyAudio: It is the cross-platform audio I/O memory optimized. well as variable in which it is stored. library. This library helps to play and record After the of After the execution also audio streams on several platforms. The program, if the the memory area stays Reference count is 0 occupied unless the 3.11 Cryptography then The garbage command is not given collector will release the externally to free the This library provides a set of procedure to decrypt memory on its own. memory. messages and to secure among computer system, matplotlib, etc. 4.1 Second Section • PyNacl: It is a bounded library called libsodium which is a collection of the Networking and cryptography library. • PocketProtector: It is a library that contains a secret management system. • Pycryptodome: It is low-level security providing library but still used for home purposes.

4 MEMORY MANAGEMENT IN Figure 1: Survey of 2017. PYTHON

Memory management is an important factor that everyone checks before choosing any programming language. Memory management is allocating particular block to programs and reduce the overall space required and increases system performance. Comparison in memory allocation in other languages and Python: int a = 10; int b= 10;Then in C language, it is stored as variable and both a and b will be provided a different memory space. While in Python it is stored as reference. When we enter value as 10 the reference count of 10 becomes 1 and when the value of b is added as 10 and the reference count becomes 2. Figure 2: Survey of 2019

Comparing the graph of 2017 and 2019 we can see it has raised its position from 5th rank to 4th rank. Also, the number of users of Python in 2017 was 32% which increased to 41.7% in 2019.

297

CESIT 2020 - International Conference on Culture Heritage, Education, Sustainable Tourism, and Innovation Technologies

4.2 Usage of Python 4.5 Top Websites Built using Python

• The top websites built are , Spotify, Netflix, Google, Uber, Dropbox, Pinterest, Instacart, Reddit, and Lyft.(https://learn.onemonth.com/10-famous- websites-built-using-python/).

5 THIS SECTION INCLUDES INFORMATION ABOUT THE DIFFERENT FRAMEWORKS USED IN PYTHON Figure 3: Graph showing usage of Python Frameworks: A collection of different 4.3 Average Python Programmers modules/packages which are used by developer to write web-applications or services without Salary by States: 2019 requirement of handling minor details such as protocols, socket, or process management. (Kumar, Dahiya)

5.1 Full Stack Framework

It is a framework that tries to provide nearly everything i.. from web serving to database management right down to HTML generation — Figure 4: Salary chart by states. which a developer could need to build an application.(Nithya et al). Here we can see that California has the highest Few Full Stack Frameworks : paid salary to the programmer and slowly there will be others joining   Giotto  Zopez them.(https://www.daxx.com/blog/development-  Turbo Gears   kiss. Py trends/python-developer-salary-usa)   Pylon  Lino  Cubic web  Reahi  4.4 Python’s Rating   wheezy.web  Pylatte 5.2 Non-full Stack

These frameworks do not provide extra functionalities and features to the developers. They have to add huge code and components manually here. Few Non-Full Frameworks are:   AppWsgi  Clastic  Cherry.Py  BlueBream  Divmod  Flask  More Path  Hug  Bobo  Falcon

 Pyramid  Bocadillo  Growler Figure 5: Rating of Python from time it was developed.

This graph shows the increment in rating since the time it is developed.

298

Python: The Most Advanced Programming Language for Computer Science Applications

6 CONCLUSION

Python is growing rapidly and has reached to 3rd rank in terms of best programming language. It is seen that; Python has many libraries which makes it unique from other programming languages. It’s popularity and ratings are increasing day by day along with the demand of Python programmers all over the world.

REFERENCES

Machine Learning and Deep Learning frameworks and libraries for large-scale data mining: a survey. Lawan, A. A., Abdi, A. S., Abuhassan, A. A., Khalid, . S., 2019. What is Difficult in Learning Programming Language Based on Problem- Solving Skills?," International Conference on Advanced Science and Engineering (ICOASE), Zakho - Duhok, Iraq. Lo, C., Wu, C., 2015. Which Programming Language Should Students Learn First? A Comparison of Java and Python," 2015 International Conference on Learning and Teaching in Computing and Engineering, Taipei, pp. 225-226. Prof. B Nithya Ramesh, Aashay R Amballi, Vivekananda Mahanta, “Django Python Framework”, International Journal of Computer Science and Information Technology Research ISSN 2348-120X (online) Vol. 6, Issue 2. Krishan Kumar, Sonal Dahiya, “Programming Language a Survey, International Journal on Recent and Innovation Trends in Computing and Volume: 5 Issue: 5.

299