Programming Languages

VIKEN CHOUCHANIAN California State University Northridge

Communication is a vital aspect of any society. The transfer of ideas, thoughts, information, or any other pertinent data would not be possible without a form of communication that all parties involved can understand. One form of communication is the use of languages. Humans have used a variety of languages with great success for millennia all over the globe. When the need to communicate with machines arose, humans created programming languages to command their creations. The first programming languages actually predate computers and existed in the form of punch cards used to guide textile looms. As more advanced computers were built, programming languages evolved to meet the complexities of these new machines. Thousands of programming languages have been created, and each programmer can choose the language that suits him or her best, based on programming style, the hardware being used, and the kind of program being created. Some of the most widely used languages are Java, C++, C, PHP, JavaScript, and Python, because they are fairly easy to learn and can be applied to a wide variety of general-purpose programs. Like any spoken language, programming languages have rules that must be followed. Syntax can be considered the grammar of a programming language, allowing the computer to understand the vocabulary used by the programmer. Programmers can use the syntax and vocabulary of a language to create almost anything in the virtual world: have machines build a car, produce the next great video game, or create the next great phone application.

Categories and Subject Descriptors: D.3 [Programming Languages]

General Terms: Languages

Additional Key Words and Phrases: Programming Languages

1. INTRODUCTION

A key factor in the evolution, socialization, and civility of human beings has been communication. From the early grunts of our first Homo sapiens ancestors to the intricate nuances of the many complex present-day languages, mankind has needed a method to transfer ideas and concepts from one person to another. The invention of writing over 5000 years ago assisted in the transfer of concepts and helped advance human civilization in every aspect of daily life. Without a common language it would be nearly impossible to have one person effectively perform a task or explain an idea conceived by another. Language has evolved into a malleable entity, constantly changing to meet the needs of modern humans, and has become a bridge between 21st century humans and the machines they have created. Just as humans have passed on knowledge to each other for millennia, people now have the capacity to communicate with their electronic and mechanical creations through formal languages that certain machines can understand. These languages, known as programming languages, are the key to the future of computers, machines, and technology in general. Without them, many modern-day inventions would be reduced to expensive silicon and plastic paperweights.

A programming language can be defined as a notational system for describing computation in machine-readable and human-readable form.[2] In simpler terms, it is a notation for telling a computer what you want it to do. Although humans and machines can communicate through programming languages, there are limitations. Programming languages, unlike human natural languages, can only facilitate communication of algorithmic ideas and cannot easily express or understand human emotion. Machines read the program and do what they are told, provided they understand what is presented. Since the invention of the first complex machines, humans have been devising more intricate ways of communicating with their machines in order to reach their goals more efficiently.

Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee. © 20YY ACM 0000-0000/20YY/0000-0001 $5.00

ACM Journal Name, Vol. V, No. N, Month 20YY, Pages 1–0??.

2. HISTORY OF PROGRAMMING LANGUAGES

The first programming language predates the computer by over 125 years. Programming languages first appeared as holes placed strategically on punch cards. These punch cards were fed into the Jacquard looms of the early 1800s, and later into player pianos, to control a sequence of operations. In the looms, the cards determined the patterns in the textiles being woven. By simply changing out the cards, the operator of the loom could change the pattern on the textile. The similarity to swapping the early 5.25 and 3.5 inch floppy disks is uncanny.

The first computer program was created by Charles Babbage in the early 19th century to run his conceptual Analytical Engine. Using this program, the Analytical Engine would be able to calculate logarithms and trigonometric functions. His original ideas, combined with various mathematical works of the late 19th and early 20th centuries, were used by Alan M. Turing in his Turing machines.[7] These machines were conceptualized in the late 1930s and worked by manipulating symbols written on a tape. These concepts gave birth to the ideas of computer memory, automated machines, and the basic workings of the modern-day CPU. The last of the primitive programming languages were used in the famous ENIAC and UNIVAC computers of the late 1940s and early 1950s. Before 1954 almost all programming was done in assembly language.[6]

Modern structured programming languages first appeared on the computers of the 1950s and 1960s. The initial modern programming languages, FORTRAN, LISP, ALGOL, and COBOL, became the basis of the computing world in the middle of the 20th century, and variations of these languages spawned a multitude of new languages.[1] Modified versions of FORTRAN, LISP, ALGOL, and COBOL are still used on 21st century machines.

As computers became more efficient and accessible to the public, programming languages had to keep pace with the rapidly increasing complexity of the technology through the early 1970s. Here is an example of simple FORTRAN II code:


The 1970s brought forth the concept of object-oriented programming. Object-oriented programming organizes a program around data structures, called objects, together with the operations on them. Many of the programming languages developed after this period incorporated object-oriented programming into their basic design. Another huge accomplishment of the 1970s was the creation of the C programming language. As of 2010, C or some variation of C is arguably the most widely used programming language in the world.

The Internet boom of the 1990s gave birth to many of the languages used by programmers today. Scripting languages such as AppleScript, JavaScript, and Python were born in this era and paved the way for many other programming languages used for the virtual world of the web. Programming languages are constantly being created, updated, and used. There are currently tens of thousands of programming languages in the world. This is not surprising: given our dependence on technology and the accessibility of powerful computers, designers have free rein to create any language they deem fit to solve any problem. Languages may be created for purposes ranging from calculating salaries for a corporation to computing the grades of college students.


3. HOW PROGRAMMING LANGUAGES WORK

The study of any language, natural or artificial, focuses on two fundamentals of the language: syntax and semantics. The syntax of a language concerns its format, form, and compositional structure. For example, the sentence "three red dog" is syntactically incorrect. The syntax of a programming language is defined as the form of its expressions, statements, and program units. Semantics, on the other hand, has to do with the meaning behind what is being expressed. The sentence "Thoughtful body greens instigate butterflies" is syntactically correct but has no semantic meaning. The semantics of a programming language is the meaning of its expressions, statements, and program units.
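The distinction between syntax and semantics can be sketched in Python, under the assumption that a one-line Python expression stands in for a "sentence". The helper name check is hypothetical. It first asks only whether the text is well formed, and then whether evaluating it yields a meaning:

```python
import ast


def check(source: str) -> str:
    """Classify a one-line Python expression as a syntax error,
    a well-formed expression that fails when evaluated, or a valid one."""
    try:
        tree = ast.parse(source, mode="eval")  # syntax check only; nothing runs
    except SyntaxError:
        return "syntax error"
    try:
        eval(compile(tree, "<example>", "eval"), {})  # now try to give it meaning
    except Exception as exc:
        return f"well-formed, but fails with {type(exc).__name__}"
    return "well-formed and meaningful"


print(check("three red dog"))  # words in an order the grammar does not allow
print(check("'body' + 3"))     # grammatical, but adding text to a number has no meaning
print(check("1 + 2"))
```

The second call plays the role of the "Thoughtful body greens" sentence: the parser accepts it, and the problem only appears when a meaning is requested.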

3.1 Syntax

The syntax of a language can help determine the popularity and ease of use of a specific programming language. A programmer who struggles with the syntax of one language can easily switch to one of the many other languages available. Like natural languages, programming languages are built from symbols called characters. Characters are put together into strings, and these strings combine to form sentences. The lowest-level strings with syntactic meaning are called lexemes. Lexemes are partitioned into groups by various criteria; for example, the names of variables, methods, and classes form a group called identifiers. The category a lexeme belongs to is called its token.[5] For example, the lexeme index belongs to the token named identifier. The tokens are then parsed. Parsing recognizes the tokens in a sentence and determines the sentence's structure with respect to the formal grammar of the language. The formal grammar of a language is the set of syntax rules it implements; it relates only to form and not to content. The following are some lexemes in the C programming language.
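As a rough illustration of lexemes and tokens, the following Python sketch splits the C statement index = 2 * count + 17; into lexemes and labels each one with a token. The token names and the regular expressions are illustrative assumptions covering only this tiny subset; they are not taken from the C standard or from the original listing.

```python
import re

# Hypothetical token categories for a small C-like subset (illustrative only).
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semicolon", r";"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (lexeme, token) pairs for the statement in `source`."""
    pairs = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "skip":  # whitespace separates lexemes but is not one
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<6} {token}")
```

Both index and count are different lexemes that fall under the same token, identifier, which is the relationship described above.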

The grammars of programming languages can be classified using the Chomsky hierarchy, a system devised by Noam Chomsky in the mid-1950s. The hierarchy consists of four classes of grammar: Type-0, Type-1, Type-2, and Type-3. Type-0 consists of the unrestricted grammars. It contains all formal grammars, which together generate every language that can be recognized by a Turing machine. These languages are also known as recursively enumerable languages. Type-1 is made up of the context-sensitive grammars, which generate the context-sensitive languages. Type-2 consists of the context-free grammars, which generate the context-free languages. Context-free grammars are usually the basis for the syntax of programming languages. Type-3 consists of the regular grammars, which generate the regular languages. Each class is contained in the one before it, so every regular grammar is also context-free. Other forms of grammar exist, but they are not part of the hierarchy.
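The difference between the two lowest levels of the hierarchy can be sketched in Python. The example assumes two small languages chosen for illustration: identifiers, which a regular expression can describe (Type-3), and balanced parentheses, which need a context-free grammar (Type-2) and are recognized here by a small recursive routine. The function names are hypothetical.

```python
import re

# Type-3 (regular): identifiers can be described by a regular expression.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(text):
    return IDENTIFIER.fullmatch(text) is not None


# Type-2 (context-free): balanced parentheses, generated by the grammar
#     S -> ( S ) S | empty
# No regular grammar generates this language, because nesting depth is unbounded.
def is_balanced(text):
    def parse_s(pos):
        """Consume one S starting at pos; return the position after it."""
        if pos < len(text) and text[pos] == "(":
            pos = parse_s(pos + 1)  # inner S
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("missing ')'")
            return parse_s(pos + 1)  # trailing S
        return pos  # empty production

    try:
        return parse_s(0) == len(text)
    except ValueError:
        return False


print(is_identifier("index"), is_identifier("2index"))
print(is_balanced("(()())"), is_balanced("(()"))
```

This mirrors a common division of labor in language implementations: lexemes are usually described with regular patterns, while nested program structure is described with a context-free grammar.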

3.2 Semantics

Once the syntax of a programming language is understood, the semantics of the language must be defined. The formal semantics of a programming language can be approached in three ways.

—The first is denotational semantics, where every phrase in the language is mapped to a phrase in a different language, called its denotation. The target language is usually mathematical. Translating a program into another language, as a compiler does, follows a similar idea.

—The second is operational semantics. Operational semantics describes how the program is executed directly, rather than translating it into another language. It gives meaning to the elements of the code by describing the changes they make to the state of the machine.

—The third is axiomatic semantics, where meaning is given to a phrase by stating the logical axioms that apply to it. There is no distinction between the meaning of a phrase and the logical formulas that describe it. This form of semantics is based on methods of formal logic.[3]
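The three approaches can be loosely sketched in Python for a toy language invented for this example (the tuple encoding and the names denote and step are assumptions, not a standard formalism). The function denote maps an expression to a mathematical function of the state, step describes an assignment by the state change it causes, and the final assertion states a property that must hold after the program, in the spirit of an axiomatic postcondition.

```python
# A tiny hypothetical language: expressions are nested tuples such as
# ("add", ("var", "x"), ("num", 1)); the only statement is ("assign", name, expr).

def denote(expr):
    """Denotational style: map an expression to a function from a state
    (a dict of variable values) to a number."""
    kind = expr[0]
    if kind == "num":
        return lambda state: expr[1]
    if kind == "var":
        return lambda state: state[expr[1]]
    if kind == "add":
        left, right = denote(expr[1]), denote(expr[2])
        return lambda state: left(state) + right(state)
    raise ValueError(f"unknown expression kind: {kind}")


def step(stmt, state):
    """Operational style: describe a statement by the state change it causes."""
    kind, name, expr = stmt
    if kind != "assign":
        raise ValueError(f"unknown statement kind: {kind}")
    new_state = dict(state)
    new_state[name] = denote(expr)(state)
    return new_state


program = [
    ("assign", "x", ("num", 2)),
    ("assign", "y", ("add", ("var", "x"), ("num", 3))),
]
state = {}
for stmt in program:
    state = step(stmt, state)
    print(stmt[1], "->", state)

# Axiomatic flavor: a postcondition that must hold once the program has run.
assert state["y"] == state["x"] + 3
```

Real formal semantics are defined mathematically rather than by running code; the sketch only shows where each approach places the meaning.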

3.3 Types

Data can be viewed as a collection of bits placed in a certain order. A programming language can take these bits and build up all of its data from them. In effect, this forms an abstract machine within the computer that becomes part of the programming language being used. The data can be classified into data types. A data type is a set of values, together with a set of operations on those values that have certain properties. A type system defines how a programming language assigns these types to expressions and values. Type systems are in place to detect many errors in code before they cause harm. Sometimes a type system works a little too well and rejects perfectly reasonable code. To circumvent these problems, programmers can often use loopholes to bypass the type system.

To ensure proper execution of code, the rules of the type system must be enforced. Type checking is the process a translator goes through to verify that all constructs in a program make sense in terms of the types of their various entities. An algorithm is used to check whether the types of expressions and statements are equivalent. Two kinds of type checking exist: static and dynamic.

3.3.1 Static Type Checking. Static type checking determines the types of objects and expressions from the text of the program. The check is performed before the program is executed. Static checking allows the translator to allocate memory and generate code that manipulates data efficiently, so execution efficiency is enhanced. It also reduces the amount of code that must be compiled, improving translation efficiency. This kind of checking catches many errors early, which improves the writability of a language. On the other hand, static type checking makes for a more rigid set of rules and reduces the programmer's flexibility.

3.3.2 Dynamic Type Checking. Dynamic type checking, on the other hand, is performed while the program is running and tends to be more flexible than static checking.
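The two kinds of checking can be contrasted with a small Python sketch over a toy expression language invented for this example. The function static_type inspects only the program text (here, a nested tuple) and never runs it, while evaluate examines operand types at the moment the operation executes. Both names and the tuple encoding are assumptions made for illustration.

```python
# Expressions of a hypothetical toy language, written as nested tuples:
#   ("num", 3)   ("str", "a")   ("add", left, right)

def static_type(expr):
    """Static check: derive the type from the program text alone, without running it."""
    kind = expr[0]
    if kind == "num":
        return "number"
    if kind == "str":
        return "string"
    if kind == "add":
        left, right = static_type(expr[1]), static_type(expr[2])
        if left != right:
            raise TypeError(f"cannot add {left} and {right}")
        return left
    raise ValueError(f"unknown expression kind: {kind}")


def evaluate(expr):
    """Dynamic check: operand types are examined only when the operation runs."""
    kind = expr[0]
    if kind in ("num", "str"):
        return expr[1]
    if kind == "add":
        left, right = evaluate(expr[1]), evaluate(expr[2])
        if type(left) is not type(right):
            raise TypeError("operands of add have different types at run time")
        return left + right
    raise ValueError(f"unknown expression kind: {kind}")


good = ("add", ("num", 1), ("num", 2))
bad = ("add", ("num", 1), ("str", "two"))

print(static_type(good), evaluate(good))
for check in (static_type, evaluate):
    try:
        check(bad)
    except TypeError as exc:
        print(f"{check.__name__} rejected the expression: {exc}")
```

Both checkers reject the same faulty expression; the difference is that the static one does so before any part of the program has been executed.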
Certain programming languages, such as JavaScript, allow only dynamic type checking because of their dynamic type binding.

Type checking also determines how strongly typed a programming language is. Strong typing is a highly valuable characteristic of a programming language. A language is strongly typed if type errors in the code are always detected. This requires that the types of all operands can be determined, either statically or dynamically. Conversely, a weakly typed language allows a value of one type to be treated as a value of a different type. Although this is useful in many instances, it can wreak havoc by allowing many errors to pass through the error detection process.

3.4 Libraries

The 1990s brought the use of libraries into focus. A library is a collection of classes or subroutines used to develop software. Historically, libraries were deemed unimportant in programming language design because they were not part of the language itself. They were seen as an interface issue rather than a programming concern. Some languages ignored libraries altogether and instead built input-output functions into the language itself. With the evolution of technology in the 21st century, however, libraries have become essential. Libraries with rich capabilities are being written in system-independent ways and incorporated into the languages themselves. This approach to libraries is imperative for a successful, highly functional language. A perfect example of a heavily used modern language that depends on libraries is Java. Although it is one of the most popular languages in the world, Java might have disappeared into the realm of lost programming languages had it not provided the Java Application Programming Interface (API). C++, another extremely popular language, has a standard library containing many utilities that help make C++ the successful language it is.

3.4.1 Library Types. Just as programming languages and technology have evolved from humble beginnings into the powerful driving forces of the 21st century, so have libraries. The first libraries were fairly simple compared with what they have since become. The various types of libraries are:

—Static Libraries: Static libraries were the first libraries in existence. Originally, all libraries were static and were referred to as archives. These archives were made up of routines, which were copied by a linking agent, usually a linker, into the desired application. This step, called a static build, produced object files and an executable file. The linking agent places all of the code and libraries at the proper memory locations, which can take a long time, depending on the operation. Some languages have incorporated smart linking, which allows the compiler to recognize which libraries are needed and include only the essential parts. This reduces memory use and run time.

—Dynamic Linking: Dynamic linking loads the subroutines of a library when the application is loaded or run, rather than copying them in at build time. All the .DLL files that accompany applications are responsible for speeding up the actions performed while running the application. This is one of the reasons that many modern programs take some time to start up but are very fast while executing requests. One disadvantage of dynamic linking is that the running application depends on the linking agent to find and load the stored libraries. If any of the libraries is deleted, misplaced, renamed, damaged, incomplete, or moved, the library cannot be loaded and the executable file will fail.

—Shared Libraries: Shared libraries are another classification of libraries, which relates to how a library is made available to a variety of programs. Two concepts are involved. First, a library is shared if its code, stored on the local disk, is used by several programs unrelated to each other. Second, a library is shared if a single copy of its code in memory (RAM) is used by several running programs. Dynamic libraries are almost always shared, while static libraries cannot be shared, because their code is copied into each application. The most popular operating systems use shared libraries because this allows for a single loader and allows applications and other executable files to be used as dynamic libraries.

—Remote Libraries: Remote libraries are libraries located on another server or computer. They are accessed over a network using a remote procedure call. This allows a computer located in a different country to use the same libraries as one in the United States.

—Object-Oriented Libraries: As object-oriented programming has grown in popularity, it makes sense that object libraries have followed. Object-oriented programming requires additional information that is not found in ordinary libraries. These special libraries must contain a list of the objects that the entry points and names of the code depend on. Cross-platform applications and systems usually use object libraries to expand their overall audience. Vendors such as IBM and Sun Microsystems quickly jumped on these opportunities to increase their profits, as did standards efforts such as CORBA.

—Class Libraries: Class libraries are very similar to object libraries. They play the same role as object libraries but usually relate to older kinds of code. Class libraries help describe the methods and characteristics of objects.[1]
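The failure mode described for dynamic linking can be illustrated, by analogy only, with run-time loading of a library by name in Python. Python's import machinery is not an operating system linker, so this is a sketch of the idea rather than of any real linking agent. The helper load_library and the missing library name are hypothetical.

```python
import importlib


def load_library(name):
    """Look up and load a library by name while the program is running."""
    try:
        return importlib.import_module(name)
    except ImportError:
        # Mirrors the failure mode described above: a missing, renamed, or
        # moved library cannot be found when it is finally needed.
        return None


math_lib = load_library("math")                    # part of the standard library
missing = load_library("no_such_library_example")  # hypothetical name, assumed absent

print(math_lib.sqrt(16.0))
print(missing is None)
```

Nothing about the missing library is known until the program actually asks for it, which is why a dynamically linked application can build successfully and still fail to start.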

4. CONSIDERATIONS WHEN CREATING A NEW PROGRAMMING LANGUAGE

Creating a new language cannot be based strictly on syntax and semantics. Certain criteria must be met when designing a new language. The designer must deliberately decide the direction the new language will take on each of the following:

—Purpose: The designer should determine the overall goal of the language: whether it is going to be designed for general-purpose use or for a specific task, such as accounting. Settling this aspect of the language makes the designer more efficient in reaching his or her goals.

—Abstraction: Factoring out recurring patterns and sub-procedures will benefit future programmers of the language.

—Simplicity: The less complicated a language is, the more favorable it will be for all to use. Limiting the number of concepts required to program in a language simplifies its overall structure.

—Orthogonality: Basic entities should be understandable on their own and should interact in an expected manner, independently of other entities.

—Regularity: Exceptions to the rules within a language must be few and far between. In English, "i before e except after c" is a rule with an exception. If there were hundreds of exceptions, the rule would have no substance.

—Translation: A translator for the language must run efficiently and quickly to allow for productive programming.

—Consistency: Similar constructs should look and act similarly. Different constructs should look and behave differently.[2]

Combining these characteristics in a new programming language makes it an efficient and useful medium of communication. The language can thrive and be useful to all of its users.


5. PROGRAMMING PARADIGMS

Programming languages mimic the operations of the computer they are running on. Therefore, the computer a language is designed for has a significant effect on how the language is created and which characteristics it has. The attributes of a programming language determine its computational paradigm. The following are different paradigms.

—Imperative Paradigm: Instructions are executed sequentially, variables are used to represent memory locations, and assignments are used to change the values of variables. Imperative languages are also referred to as procedural languages because a program is a sequence of statements that represent commands. Most programming languages currently in use are imperative.

—Functional Paradigm: Based on mathematics and the abstract notion of a function in lambda calculus. This paradigm bases the description of computation on the evaluation of functions, or the application of functions to known values. Languages incorporating the functional paradigm are sometimes called applicative languages. The basic mechanism of the functional paradigm is the function call, in which the program evaluates a function, passes values as parameters to functions, and receives values returned from functions. LISP is an example of a functional programming language.

—Logic Paradigm: Logic programming is based on symbolic logic. A program in such a language consists of a set of statements that describe what is true about the desired result, rather than a sequence of statements that must be executed in a particular order. These languages have no need for loops; the only necessity is a statement of the properties of the computation. Since all properties are declared and there is no prescribed sequence of execution, logic programming is also referred to as declarative programming. The only widely used logic-based language is Prolog.

—Object-Oriented Paradigm: This paradigm is based on the idea of an object. An object can be described as a collection of memory locations together with all the operations that can change the values of these locations. An example of an object is a variable. In many object-oriented languages, objects are grouped into classes that represent all of the objects with the same characteristics. These classes define several things. First, a constructor allocates memory and provides an initial value for the data of an object. Second, accessor operations provide a way to read that data. Finally, methods are the procedures that operate on the data and compute values. Object-oriented programming is found in numerous new languages and seems to be a staple for the future of programming.[2]
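Three of the paradigms above can be compared by writing the same computation, the greatest common divisor, in each style. The Python sketch below is an illustration only; the function and class names are hypothetical, and the logic paradigm is omitted because Python does not support it directly.

```python
def gcd_imperative(u, v):
    """Imperative: a sequence of assignments that update memory locations."""
    while v != 0:
        u, v = v, u % v
    return u


def gcd_functional(u, v):
    """Functional: the result is defined by function application and recursion."""
    return u if v == 0 else gcd_functional(v, u % v)


class IntWithGcd:
    """Object-oriented: data bundled with the operations that act on it."""

    def __init__(self, value):  # constructor provides the initial value
        self._value = value

    def value(self):  # accessor for the stored data
        return self._value

    def gcd(self, other):  # method that computes with the stored data
        u, v = self._value, other
        while v != 0:
            u, v = v, u % v
        return u


print(gcd_imperative(48, 18))
print(gcd_functional(48, 18))
print(IntWithGcd(48).gcd(18))
```

The imperative version changes u and v step by step, the functional version never reassigns a variable, and the object-oriented version attaches the computation to the data it belongs to.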

The following is a sample of C++ code [4]:

6. WHICH LANGUAGE IS BEST?

Evaluating a language involves a variety of factors that can differ from person to person. However, two factors are common to almost everyone who compares programming languages: reliability and cost, which trump most other characteristics of a language. For example, Java requires that every reference to an array element be checked to make sure the index is within its range. Although this step improves reliability, it also raises the cost in memory, speed, and money. The C programming language does not require this check, so C executes faster, uses less memory, and can cost less monetarily. The tradeoff for the lower cost is a reduction in overall reliability, which can doom a program after it is created.

Determining which language is greater than all the others is like trying to determine which branch of medical science is the best. Each branch of medicine is crucial to keeping humans healthy, but in each decade a different subject may become the focus of the medical community. Computer science is similar in the way new ideas are accepted or rejected. With the boom of the Internet in the late 1990s, many programming languages appeared with the Internet in mind. As technology advances toward artificial intelligence, languages may veer toward that branch of computer science, and new kinds of languages may appear. Simply put, the best language is the one that works best for the situation and for the programmer. A programmer comfortable in C++ will decide that C++ is the best, while a person versed in Python will defend his or her language. The best language is one that suits the program being created, can be applied correctly to the situation, and that the programmer is comfortable using. Any language, natural or artificial, can be used once mastered to create great works of art that can be cherished for years to come.

REFERENCES

[1] Naomi S. Baron. Computer Languages, How to Make Your Way Through the New Language Maze, A Guide for the Perplexed. Anchor Press/Doubleday, NY, 1986.
[2] Kenneth C. Louden. Programming Languages, Principles and Practice, Second Edition. Brooks/Cole, CA, 2003.
[3] Robert Noonan, Allen Tucker. Programming Languages, Principles and Paradigms. McGraw-Hill, NY, 2002.
[4] Functions. University of Regina Computer Science Department, www.cs.uregina.ca/Links/class-info/cplusplus/Function.html (10/31/10).
[5] Robert W. Sebesta. Concepts of Programming Languages, Eighth Edition. Pearson, MA, 2008.
[6] Richard L. Wexelblat. History of Programming Languages. Academic Press Inc., NY, 1978.
[7] Ryan Stansifer. The Study of Programming Languages. Prentice-Hall Inc, NJ, 1995.
[8] Dr. Mark Utting. Programming Languages. University of Waikato, www.cs.waikato.ac.nz/ marku/languages.html (10/31/10).
