MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Automatic Essay Scoring in E-learning System Using LSA Method with N-Gram Feature for Bahasa Indonesia

Rico Setiadi Citawan*, Viny Christanti Mawardi, and Bagus Mulyawan

Faculty of Information Technology, Tarumanagara University, Letjend S. Parman No. 1, West Jakarta 11440, Indonesia

Abstract. In the world of education, e-learning system is a system that can be used to support the educational process. E-learning system is usually used by educators to learners in evaluating learning outcomes. In the process of evaluating learning outcomes in the e-learning system, the form type of exam questions that are often used are multiple choice and short stuffing. For exam questions in the form of essays are rarely used in the evaluation process of educational because of the difference in the subjectivity and time consuming in the assessment process. In this design aims to create an automatic essay scoring feature on e-learning system that can be used to support the learning process. The method used in automatic essay scoring is (LSA) with n-gram feature. The evaluation results of the design features automatic essay scoring showed that the accuracy of the average achieved in the amount of 78.65 %, 58.89 %, 14.91 %, 71.37 %, 64.49 % in the LSA unigram, , , unigram + bigram, unigram + bigram + trigram.

Key words: Automatic essay scoring, e-learning, latent semantic analysis, n-gram, singular value decomposition

1 Introduction Along with advancement of internet technology rapidly allows the application of technology in various fields including one in the education field. One of the technologies that can be utilized to support the learning process is e-learning system. In e-learning system many features are developed to support the learning process such as online exam features who provided by educators to learners in evaluating learning outcomes. In e-learning system, evaluation of learning outcomes in essay form is important because evaluation results in essay form can be used by educators as an indicator to measure the level of learners understanding of the exams given. Therefore, to solve the problem is designing e-learning system with automatic essay scoring feature that can be used to support the learning process between educators and learners. There are several methods that can be used to perform assessments in automatic essay scoring such as string matching algorithms like Rabin Karp, Knuth Morris Pratt,

* Corresponding author: [email protected]

© The Authors, published by EDP Sciences. This is an open access article distributed under the terms of the Creative Commons Attribution License 4.0 (http://creativecommons.org/licenses/by/4.0/). MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Levenshtein distance, etc. However, for this designing is used method Latent Semantic Analysis (LSA) which can be used to find the similarity between answer student essays with answer key model lecturer in automatic essay scoring. LSA is a method in information retrieval that represents words along with sentences in a matrix with mathematical calculations. Mathematical calculations are performed by mapping the presence or absence of words from word groups in the matrix. LSA method uses a linear algebra technique called Singular Value Decomposition (SVD) to find patterns of relationships between words and contexts that contain the words in semantic matrix. Generally existing Automatic Essay Scoring (AES) techniques which are using LSA method do not consider word order in a sentence. To solve the problem, the LSA method used in automatic essay scoring for this designing will be added with n-gram feature so that word order in sentence can be considered.

2 Literature Review In this case, the Latent Semantic Analysis (LSA) method with n-gram feature are used to create an automatic essay scoring in e-learning system.

2.1 Pre-processing Pre-processing is an early stage in processing input data before entering the main stage process. Pre-processing consists of several stages. The pre-processing stages are performed in the system are case folding, , tokenization. Here is an explanation of the pre-processing steps performed as follows: i) Case Folding Case folding is the stage that changes all the letters in the corpus into lowercase. The purpose of case folding is to uniform input data because the input can be lowercase and uppercase. ii) Lexical Analysis Lexical analysis is the process of eliminating numbers, punctuation and characters which cannot be read by the system. iii) Tokenization Tokenization is the process of cutting or separating words from a collection of text or corpus.

2.2 Latent semantic analysis (LSA) LSA stands for Latent Semantic Analysis is an information retrieval method that works on a set of texts in which words can be represented along with their context in a matrix with mathematical calculations. The formed matrix is a mapping of the presence or absence of a word or term from a particular group of words or terms. In addition, the context used can be a list of keywords, sentences, even paragraphs. Latent Semantic Analysis (LSA) was proposed and patented in 1988 by Scott Deerwester, Thomas Launder, and Susan Dumais [1]. This method is widely used in the field of Information Retrieval and Natural Language Processing. One example of the use of the LSA method is the assessment of essay answers. The LSA method uses the mathematical technique of linear algebra called Singular Value Decomposition (SVD) in forming semantic matrix. The semantic matrix that has been formed is a matrix that contains the pattern of relationship between words with the context in matching between the answer and answer

2 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Levenshtein distance, etc. However, for this designing is used method Latent Semantic key [2]. LSA methods generally work based on the unigram word occurring simultaneously Analysis (LSA) which can be used to find the similarity between answer student essays in certain contexts so that LSA method is more concerned with the keywords contained in a with answer key model lecturer in automatic essay scoring. sentence regardless of the word orders and grammar in a sentence, Therefore, to solve the LSA is a method in information retrieval that represents words along with sentences in a problem, in this LSA method will be added n-gram feature so that LSA method can work matrix with mathematical calculations. Mathematical calculations are performed by on the appearance of word and phrase in certain context so that word order in sentence can mapping the presence or absence of words from word groups in the matrix. LSA method be considered [3]. uses a linear algebra technique called Singular Value Decomposition (SVD) to find patterns The designing of automatic essay scoring using LSA method, the first step is to of relationships between words and contexts that contain the words in semantic matrix. represent the student answer and lecturer answer key in a matrix by forming term frequency Generally existing Automatic Essay Scoring (AES) techniques which are using LSA matrix of student answer and term frequency matrix of lecturer answer key. The term method do not consider word order in a sentence. To solve the problem, the LSA method frequency matrix is formed by counting the number of times the word appears on the used in automatic essay scoring for this designing will be added with n-gram feature so that lecturer answer key. The frequency of occurrence of a word or term that counts can be a set word order in sentence can be considered. of unigram, bigram, trigram, or all of them. The formed term frequency matrix consists of two are the term frequency matrix of the lecturer answer key and the frequency term matrix of the student answer. The frequency 2 Literature Review term matrix for the lecturer answer key consists of rows and columns where each row in the In this case, the Latent Semantic Analysis (LSA) method with n-gram feature are used to matrix represents a unique word in the lecturer answer key while each column in the matrix create an automatic essay scoring in e-learning system. represents each sentence of lecturer answer key. In addition, each cell in the matrix represents the number of times the occurrence of a unique word that exists in each sentence of lecturer answer key. 2.1 Pre-processing In the term frequency matrix for student answers also consists of rows and columns where each row in the matrix represents a unique word on the lecturer answer key while Pre-processing is an early stage in processing input data before entering the main stage each column in the matrix represents every student answers sentence then each cell on the process. Pre-processing consists of several stages. The pre-processing stages are performed matrix represents the number of occurrences of a unique word lecturer answer key is in in the system are case folding, lexical analysis, tokenization. Here is an explanation of the every student answer sentence. The weighting used in each matrix cell is the frequency of pre-processing steps performed as follows: the student's answer and the lecturer's answer key are the weighting of the term frequency i) Case Folding (tf). Case folding is the stage that changes all the letters in the corpus into lowercase. The The next stage is to calculate Singular Value Decomposition (SVD) on the term purpose of case folding is to uniform input data because the input can be lowercase and frequency matrix that has been formed in the previous stage is the term frequency matrix uppercase. for lecturer answer key and term frequency matrix for student answers. The purpose of ii) Lexical Analysis SVD is discover a new pattern or relationship “latent” between words and sentences Lexical analysis is the process of eliminating numbers, punctuation and characters which containing these words. The result of the SVD calculation process produces three matrices cannot be read by the system. components are two orthogonal matrices and one diagonal matrix. The factoring of SVD iii) Tokenization from a matrix A dimension txd is as follows: Tokenization is the process of cutting or separating words from a collection of text or T corpus. Atxd Utxn Snxn Vdxn (1)

Where:

2.2 Latent semantic analysis (LSA) A: term matrix sized txd. LSA stands for Latent Semantic Analysis is an information retrieval method that works on U: orthogonal matrix sized txn. a set of texts in which words can be represented along with their context in a matrix with S: diagonal matrix sized nxn. mathematical calculations. The formed matrix is a mapping of the presence or absence of a V: orthogonal matrix transpose sized dxn. word or term from a particular group of words or terms. In addition, the context used can be t: number of rows of the matrix A. a list of keywords, sentences, even paragraphs. d: number of columns of the matrix A. Latent Semantic Analysis (LSA) was proposed and patented in 1988 by Scott n: the rank value of the matrix A (≤ min(t, d)). Deerwester, Thomas Launder, and Susan Dumais [1]. This method is widely used in the The decomposed of matrix A produces three matrices of the matrix U, S, and VT. The field of Information Retrieval and Natural Language Processing. One example of the use of size or dimensions of three matrices generated in the SVD process are different. In the the LSA method is the assessment of essay answers. The LSA method uses the matrix U has the same row as the matrix A, but has fewer columns than the matrix A is n mathematical technique of linear algebra called Singular Value Decomposition (SVD) in column. In the matrix S has the size of n row and n columns. In the matrix VT has the same forming semantic matrix. column as the matrix A, but has fewer row sizes than matrix A or equal to column matrix A The semantic matrix that has been formed is a matrix that contains the pattern of is d row. relationship between words with the context in matching between the answer and answer

3 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Fig. 1. Singular value decomposition of matrix [2].

Based on Figure 1, the column of the matrix U is an orthogonal eigenvector of A × AT. In column matrix V is an eigenvector of orthogonal AT × A. In the matrix S is a diagonal matrix whose value is the root of the eigenvalues of the matrix U or V. After doing the SVD process, the next step is to reduce the matrix of SVD calculation result or smoothing process. The purpose of reducing the size or dimension of the matrix is to remove noise or words that are considered unimportant without eliminating the correlation or the relationship between words and sentences [4]. The reduction of the matrix dimension is done by reducing the dimension of the diagonal matrix containing the singular value of the second matrix result of the SVD process. When reducing the size of the diagonal matrix S containing the singular value, the first step is, choose a value of k. The value of k which can be selected in reducing the matrix size is k n. After that the value of k will be used on the matrix of the SVD calculation results by taking the firsts row data and the first column k data on the diagonal matrix containing the singular value [5]. Unselected dimension the singular value on the diagonal matrix S will be set to zero. The reduction in the diagonal matrix (S) also affects both orthogonal matrices U and V. The result of the reduced SVD matrices can be used for reconstructing the matrix by using a lower dimension (Ak). Reconstructed matrix (Ak) has the same size as the previous matrix (A). However, the reconstructed matrix (Ak) is a matrix with approximation of matrix A using a lower dimension which can show the hidden relationship between words in the sentence so that it can cover all important words and can ignore unimportant words that are not considered. In addition, the reconstructed matrix can estimate the hidden new relationship of a group of words or terms in a particular sentence so that the word or term group contained in the sentence may be close to or relate to each other in space k although the word or term is not ever present or appear on the sentence. Reconstruction matrix of SVD results can be seen in the Equation 2 and Figure 2. T Atxd  Ak U k Sk Vk (2)

4 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Where: Ak: reconstruction of term frequency matrix sized k. Uk: orthogonal matrix sized k. Sk: diagonal matrix sized k. Vk: orthogonal matrix transpose sized k. k: parameter of reduction dimension matrix.

Fig. 1. Singular value decomposition of matrix [2].

Based on Figure 1, the column of the matrix U is an orthogonal eigenvector of A × AT. In column matrix V is an eigenvector of orthogonal AT × A. In the matrix S is a diagonal matrix whose value is the root of the eigenvalues of the matrix U or V. After doing the SVD process, the next step is to reduce the matrix of SVD calculation result or smoothing process. The purpose of reducing the size or dimension of the matrix is to remove noise or Fig. 2. Reconstruction matrix of singular value decomposition. words that are considered unimportant without eliminating the correlation or the relationship between words and sentences [4]. The reduction of the matrix dimension is After the reconstruction of the student answer matrix and the lecturer answer key matrix done by reducing the dimension of the diagonal matrix containing the singular value of the has been formed, the next step is to form the answer vector and answer key vector of the second matrix result of the SVD process. matrix result from the student answer and the lecturer answer key who have been When reducing the size of the diagonal matrix S containing the singular value, the first reconstructed again using the lower dimension. Reconstruction using the lower dimension step is, choose a value of k. The value of k which can be selected in reducing the matrix will produce a new representation or relationship on the student answer and the lecturer size is k n. After that the value of k will be used on the matrix of the SVD calculation answer [6]. In this case the answer vector is the student answer sentence and the answer key results by taking the firsts row data and the first column k data on the diagonal matrix vector is the lecturer answer key sentence. The final stage of the automatic essay scoring containing the singular value [5]. Unselected dimension the singular value on the diagonal using LSA is to measure the similarity between the vector of students answers and the matrix S will be set to zero. The reduction in the diagonal matrix (S) also affects both vector of lecturer answer key. orthogonal matrices U and V. The result of the reduced SVD matrices can be used for reconstructing the matrix by 2.3 Cosine similarity using a lower dimension (Ak). Reconstructed matrix (Ak) has the same size as the previous matrix (A). However, the reconstructed matrix (Ak) is a matrix with approximation of Cosine Similarity is a technique for measuring the similarity of the angle cosine value matrix A using a lower dimension which can show the hidden relationship between words between document vectors and query vectors. In this case the document vector represents in the sentence so that it can cover all important words and can ignore unimportant words the sentence in the student answer while the query vector represents the sentence in the that are not considered. lecturer answer key. After the document vector (d) and query vector (q) has been formed, In addition, the reconstructed matrix can estimate the hidden new relationship of a the similarity between the document vector and the query vector can be calculated using the group of words or terms in a particular sentence so that the word or term group contained in Equation (3). the sentence may be close to or relate to each other in space k although the word or term is t w xw not ever present or appear on the sentence. Reconstruction matrix of SVD results can be  j1 qj dij seen in the Equation 2 and Figure 2. CosineSim q, d   (3) t 2 t 2     T w  w  Atxd Ak U k Sk Vk (2)  j1 dij  j1 qj

5 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Where: q; the student answer vector. d: the lecturer answer key vector.

2.4. N-gram Basically, the n-gram model is a probabilistic model designed by mathematicians in the early 20th century and later developed to predict the next item in the order of items. The item can be a sequence letter/character or a word sequence [7]. In the word generation process, n-gram consists of words sequence along n words in a sentence. Where in word, n- gram is can be used to extract pieces of n words from a sentence or paragraph. A n-gram of size 1 is called unigram, for n-gram of size 2 called bigram, for n-gram of size 3 called trigram, whereas for n-gram is a term when it has a size greater than 3. For example, the word sequence “perangkat keras komputer” (computer hardware), the word sequence in n- gram as follow. Unigram : perangkat, keras, komputer. (device, hard, computer) Bigram : perangkat keras, keras komputer. (hard device, computer hard) Trigram : perangkat keras komputer. (computer hardware)

2.5. Evaluation Evaluation of an automated essay scoring system is needed to assess how the Latent Semantic Analysis method works good in assessing or scoring the essay answers. This evaluation is done by comparing the judgments which generated by the system with human judgment, in this case the lecturer. After all data was already obtained, the next step will be analyzing mean absolute error and mean absolute percentage error result. The result of mean absolute error is the average difference between value from lecturer and systems. n x  y  i i MAE  i 1 (4) n Where: MAE : average value from lecturer and system. Xi : value from lecturer. Yi : value from system. n : numbers of data.

While the result of mean absolute percentage error is the average difference between value from lecturer and systems. Then the result of mean absolute percentage error calculation can also be measured the success rate of the automatic essay scoring system in assessing or scoring student essay answers.

n xi  yi i1 xi MAE  100 (5) n Where: MAPE : average percentage error value. Xi : value from lecturer. Yi : value from system. n : numbers of data.

6 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Where: 3 Planning and implementation q; the student answer vector. d: the lecturer answer key vector. 3.1. System plan

2.4. N-gram s se w en e nnn needs o e done n sse n s se s done rs deernn e urose nd use o e sse o e desned e desned Basically, the n-gram model is a probabilistic model designed by mathematicians in the sse s n eernn sse w uoc ess scorn eure e sse s u early 20th century and later developed to predict the next item in the order of items. The nd s nended or sudens nd ecurers n e ernn rocess were n s sse item can be a sequence letter/character or a word sequence [7]. In the word generation nnn sudens cn do e e en e ecurer n e or o esss we e process, n-gram consists of words sequence along n words in a sentence. Where in word, n- ecurer cn e e es o e suden n e or o ess wc en w e gram is can be used to extract pieces of n words from a sentence or paragraph. A n-gram of uoc ssessed e sse w e nswer e wo s een ncuded size 1 is called unigram, for n-gram of size 2 called bigram, for n-gram of size 3 called ecurer trigram, whereas for n-gram is a term when it has a size greater than 3. For example, the e n se o e uoc ess scorn sse s e reron se o e word sequence “perangkat keras komputer” (computer hardware), the word sequence in n- requred d suc s suden nswers quesons nd odes o ecurer nswers e gram as follow. conssn o quesons nd nswers e re en ro e oo scoern ouers Unigram : perangkat, keras, komputer. (device, hard, computer) 2010: Living in Digital World", websites, ppt and pdf about course material “introductory Bigram : perangkat keras, keras komputer. (hard device, computer hard) noron ecnoogy”. Trigram : perangkat keras komputer. (computer hardware) enwe e suden nswer d s coeced dsrun quesonnres connn quesons ou course er nroducor noron ecnoo o e 2.5. Evaluation sudens o e noron ecnoo cu runr ners n desnn uoc ess scorn sse s u o ce e desn o e sse n Evaluation of an automated essay scoring system is needed to assess how the Latent ssessn ess nswers en e coeced d used n e sse suc s suden nswers Semantic Analysis method works good in assessing or scoring the essay answers. This quesons nd odes o ecurer nswer e w e sored n e dse evaluation is done by comparing the judgments which generated by the system with human uoc ess scorn sse desned o recee nu n e or o quesons judgment, in this case the lecturer. After all data was already obtained, the next step will be nswer es nd nswers ess uesons nd nswers were oned ro e ecurer analyzing mean absolute error and mean absolute percentage error result. The result of we e ess nswers were oned ro e suden en e nswers e nd nswers mean absolute error is the average difference between value from lecturer and systems. w e done rerocessn er e sse w eror e sr ccuon n eween e sudens ess nswers w ec ode o ecurers nswer e usn en x  y  i i MAE  i 1 (4) enc nss were e es sr ues oned usn cosne n sr w e used o roduce e suden scores ouu Where: n e uoed ess scorn sse usn desned s sown n ure s MAE : average value from lecturer and system. sse runs rs nn e d enered e user conssn o ecurers nd Xi : value from lecturer. sudens ecoe sse nu s ecurer nswer e d nd suden nswer d nce e sse es nu ro e user e sse oes no e rerocessn Yi : value from system. n : numbers of data. se e rocess o ccun e en enc nss or e resu o e sudens nswer score While the result of mean absolute percentage error is the average difference between sed on e ow n ure ere s rerocessn se rerocessn s seres value from lecturer and systems. Then the result of mean absolute percentage error o ses erored o ne nu d eore enern e ne se n e re calculation can also be measured the success rate of the automatic essay scoring system in rocessn sed ere re seer ses erored suc s cse odn ec nss assessing or scoring student essay answers. nd oenon urer d s een rerocessn s w e nu or ener n e rocess o x  y n i i en senc nss or n ses erored n e rocess o en senc i1 xi nss or s e oron o er requenc r or nswer nd er requenc MAE  100 (5) n r or nswer e e rocess o orn er requenc r counn e nuer o words or ers er n ec senence n e r en e er Where: requenc r s een ored w e ccued or nur ue MAPE : average percentage error value. ecooson w roduce e rces nd e resu o rces X : value from lecturer. i ccuon w e reducon rocess wc en e used o reconsruc e r usn : value from system. Yi ower denson n : numbers of data.

7 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Fig. 3. Similarity answer key and answer using LSA.

n te formation of te term freuency matri of te answer, te set of terms or words used to calculate te freuency of words occurrence in eac sentence answer is a collection of terms or words derived from te preprocessing answers ey. Liewise, for te process of forming te term freuency matri of answer ey. f te reconstruction answer matri

8 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

and answer ey matri as been formed, ten te net process is to form an answer vector and answer ey vector. e answer vector and te answer ey vector tat as been formed will be calculated by te value of similarity using osine imilarity metod were a vector formed is a representation of a sentence in te matri column. en te value used to evaluate eac sentence answer is te value wit te igest similarity between answers vector wit all te answers ey vector. n te process of assessing student answers, te value to be taen to be te output is te average value obtained from te sum of eac value of te igest similarity between te student answer sentences tat is compared wit all ey lecturer answer sentences divided by te number of sentences of student answers. en te average value of similarity tat as range 0 to 1 multiplied by value 100.

3.2. System implementation fter te design pase of te system as been designed ten te net stage is to mae software adustments wit te system design tat as been made before. e first ting to consider in te maing of tis system is te system needs consisting of ardware and software from te server. erver serves as a mediator during te process of data transfer using internet connection, on te manufacture of te system will use te osting server. e ardware used for system development used a computer wit erver computer specifications as a server. s for te client side can use a computer tat as a web browser lie internet eplorer, google crome, moilla firefo, etc. oftware used on te server tat is manager and L erver anagement tudio are used as central data storage server, atlab used to perform matematical calculations of linear algebra, and . and erl is used as a programming language bac end. fter all te ardware and software tat will be used in te manufacture is available, ten te creation of te system can be started by first setting up te database needed to store data before and after processing. e eisting database will be created using te L erver database server. fter te database is created, ten te net stage to do is create interface display for eac module tat as been designed previously. Display interface is made as simple as possible so tat users can use te eisting program easily. nce te eisting interface is created, it will mae te program code for te eisting interface so tat te interface is made to wor properly. f te eisting program code as been created, ten will be tested against eac module tat as been made. esting is made to be nown weter te eisting program as been running as epected or not. n addition, troug te tests made can also be nown weter te program is made tere is still an error or not. nce te interface as been completed, program coding will be made available for automatic essay scoring features wit te algoritm and te teoretical base described earlier. Wen te coding program as been completed, it will be tested against te features tat ave been made. eature testing is done to find out weter te program as been made output according to te designed. Fig. 3. Similarity answer key and answer using LSA.

n te formation of te term freuency matri of te answer, te set of terms or words 4 Result and discussion used to calculate te freuency of words occurrence in eac sentence answer is a collection of terms or words derived from te preprocessing answers ey. Liewise, for te process esting conducted consist of two stages is blac bo testing and automatic essay scoring of forming te term freuency matri of answer ey. f te reconstruction answer matri features testing. lac bo testing is done to test weter eac function in eac module as been able to function properly. n testing te automatic essay scoring feature is performed

9 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

to determine weter te system created can assess te essay answers according wit te udgments made by umans.

4.1. Modules testing is testing is performed on all modules tat eist on te system created. utline, te modules tat eist in te system is divided into 2 main modules namely te module for student and modules for lecturer.

4.1.1 Automatic essay scoring modules testing e results of te testing conducted found tat all te functions of te eisting automatic essay scoring module ad been able to run according wit its function. ere is a picture sowing te test results from one of te automatic essay scoring modules.

Fig. 4. Testing results of the automatic essay scoring module when the student chooses the exam.

10 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017 to determine weter te system created can assess te essay answers according wit te udgments made by umans.

4.1. Modules testing is testing is performed on all modules tat eist on te system created. utline, te modules tat eist in te system is divided into 2 main modules namely te module for student and modules for lecturer.

4.1.1 Automatic essay scoring modules testing e results of te testing conducted found tat all te functions of te eisting automatic essay scoring module ad been able to run according wit its function. ere is a picture sowing te test results from one of te automatic essay scoring modules.

Fig. 5. Testing results of the automatic essay scoring module when the student submits the exam.

Fig. 4. Testing results of the automatic essay scoring module when the student chooses the exam.

11 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Fig. 6. Testing results of the automatic essay scoring module when the lecturer gives the value manually.

4.2. Automatic essay scoring feature testing e process of automated essay scoring system is done to find out weter te system created can assess te essay answers according wit te assessment conducted by uman or lecturers and find out ow muc te average level of accuracy generated by te automatic essay scoring system. e process of automated essay scoring system test is divided into two parts namely te testing process by te researcers and te testing process by te respondents.

4.2.1 Automatic essay scoring feature testing by researchers n te process of testing conducted by researcer, te initial process is to collect a set of uestions and answers, ten te data collected will be tested weter te automatic essay scoring system can assess te essay answers automatically according wit te assessment of umans or lecturers. fter tat te collected data will be validated by aculty of

12 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

ecnology arumanagara niversity lecturers. esting conducted by te researcers tere are 2 parts are testing te perfect answer wit te same word order and testing te perfect answer wit different word order.

4.2.2 Automatic essay scoring feature testing by respondents n te process of testing by respondents, te testing begins by distributing uestionnaires to aculty of ecnology arumanagara niversity students consisting of uestions about te course material "ntroduction to nformation ecnology". otal respondents wo fill te uestionnaire as many as si people wile te total answers of students wo successfully collected 0 answers. efore te testing started, first te uestionnaire will be assessed manually by eperts wo are lecturers of aculty of ecnology arumanagara niversity. e following are some of te data for uestions on te uestionnaire: Table 1. ome of te data for uestions on te questionnaire. No Questions

1 Apakah yang dimaksud dengan perangkat keras? Wat is ardware 2 Jelaskan yang anda ketahui tentang internet? plain wat you now about te internet Jelaskan yang anda ketahui tentang pengertian dan fungsi processor pada komputer? plain wat you now about te meaning and function of te processor on te computer Jelaskan apa yang dimaksud dengan jaringan komputer beserta tujuannya? plain wat is meant by computer networ and te purpose Apa yang dimaksud dengan sistem operasi pada komputer ? Jelaskan beserta fungsinya? Wat is an operating system on a computer Describe along wit te function

Data in te form of uestionnaires tat ave been collected will be tested to find out ow muc te average mean absolute error between te assessment by te system wit te Fig. 6. Testing results of the automatic essay scoring module when the lecturer gives the value manual assessment by umans and te average mean absolute percentage error of te manually. automatic essay scoring for assessing essay answers. en wit te average level of error percentage can also be measured ow muc te system accuracy level. 4.2. Automatic essay scoring feature testing esting toward te automatic essay scoring feature is conducted troug five scenarios, namely: e process of automated essay scoring system is done to find out weter te system 1. L nigram. created can assess te essay answers according wit te assessment conducted by uman or 2. L igram. lecturers and find out ow muc te average level of accuracy generated by te automatic .L rigram. essay scoring system. e process of automated essay scoring system test is divided into two parts namely te testing process by te researcers and te testing process by te . L nigram igram. respondents. 5. L nigram igram rigram.

ummary results of testing scenarios conducted by respondents can be seen in te 4.2.1 Automatic essay scoring feature testing by researchers able 2. n te process of testing conducted by researcer, te initial process is to collect a set of uestions and answers, ten te data collected will be tested weter te automatic essay scoring system can assess te essay answers automatically according wit te assessment of umans or lecturers. fter tat te collected data will be validated by aculty of

13 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Table 2. e results of testing scenarios by respondents.

System Score Number Lecturer of Answer Score Scenario Scenario Scenario Scenario Scenario 1. 2. 3. 4. 5. 1. 0 . . 2.1 2.2 .

2. 0 1.1 1. . . 2.

. 0 1. . . . 1.0

. 0 2. 20.21 1.1 1. .

. 0 . 2. 0 2. 2.

.. … … … … … …

0. 100 100 2. .2 1. .

4.2.3 Evaluation of automatic essay scoring feature by researchers ased on te results of testing conducted by researcers, te system as been able to assess te essay answers wit te range 1 to 100. cores tat can be generated system is unigram score, bigram score, trigram score, unigram bigram score, and unigram bigram trigram score. ere is one eample of testing conducted by researcers. uestions: Apa yang dimaksud dengan perangkat keras Wat is ardware

nswer ey: Perangkat keras merupakan bagian fisik dari komputer yang berfungsi untuk memberi masukan, menampilkan dan mengolah keluaran. Perangkat keras ini digunakan oleh sistem untuk menjalankan suatu perintah yang telah diprogramkan artinya berhubungan dengan perangkat lunak. Perangkat keras ini dibedakan menjadi empat jenis yaitu perangkat input, pemrosesan, output dan penyimpanan. ardware is te pysical part of te computer tat serves to provide input, display and process te output. is ardware is used by te system to run a command tat as been programmed wic means it is related to te software. is ardware is divided into four types consisting of input devices, processing, output and storage.

4.2.3.1 Testing the perfect answer with the same word order nswer: Perangkat keras merupakan bagian fisik dari komputer yang berfungsi untuk memberi masukan, menampilkan dan mengolah keluaran. Perangkat keras ini digunakan oleh sistem untuk menjalankan suatu perintah yang telah diprogramkan artinya berhubungan dengan perangkat lunak. Perangkat keras ini dibedakan menjadi empat jenis yaitu perangkat input, pemrosesan, output dan penyimpanan.

14 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Table 2. e results of testing scenarios by respondents. System Score Number Lecturer of Answer Score Scenario Scenario Scenario Scenario Scenario 1. 2. 3. 4. 5. 1. 0 . . 2.1 2.2 . 2. 0 1.1 1. . . 2.

. 0 1. . . . 1.0 4.2.3.2 Testing the perfect answer with different word order . 0 2. 20.21 1.1 1. . Keras perangkat merupakan bagian fisik dari komputer yang berfungsi untuk masukan . 0 . 2. 0 2. 2. memberi, menampilkan dan mengolah keluaran. Keras perangkat ini digunakan oleh sistem untuk menjalankan suatu perintah yang telah diprogramkan artinya berhubungan dengan .. … … … … … … lunak perangkat. Keras perangkat ini dibedakan menjadi empat jenis yaitu perangkat input, pemrosesan, output dan penyimpanan 0. 100 100 2. .2 1. . 4.2.3 Evaluation of automatic essay scoring feature by researchers ased on te results of testing conducted by researcers, te system as been able to assess te essay answers wit te range 1 to 100. cores tat can be generated system is unigram score, bigram score, trigram score, unigram bigram score, and unigram bigram trigram score. ere is one eample of testing conducted by researcers. uestions: Apa yang dimaksud dengan perangkat keras 4.2.4 Evaluation of automatic essay scoring feature by respondents Wat is ardware nswer ey: Perangkat keras merupakan bagian fisik dari komputer yang berfungsi untuk memberi masukan, menampilkan dan mengolah keluaran. Perangkat keras ini digunakan oleh sistem untuk menjalankan suatu perintah yang telah diprogramkan artinya berhubungan dengan Table 3. perangkat lunak. Perangkat keras ini dibedakan menjadi empat jenis yaitu perangkat input, pemrosesan, output dan penyimpanan. No Testing Scenarios LSA MAE MAPE ardware is te pysical part of te computer tat serves to provide input, display and process te output. is ardware is used by te system to run a command tat as been programmed wic means it is related to te software. is ardware is divided into four types consisting of input devices, processing, output and storage.

4.2.3.1 Testing the perfect answer with the same word order nswer: Perangkat keras merupakan bagian fisik dari komputer yang berfungsi untuk memberi masukan, menampilkan dan mengolah keluaran. Perangkat keras ini digunakan oleh sistem untuk menjalankan suatu perintah yang telah diprogramkan artinya berhubungan dengan perangkat lunak. Perangkat keras ini dibedakan menjadi empat jenis yaitu perangkat input, pemrosesan, output dan penyimpanan.

15 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

Fig. 7. Graph the average rate of system accuracy percentage by respondents.

5 Conclusion

16 MATEC Web of Conferences 164, 01037 (2018) https://doi.org/10.1051/matecconf/201816401037 ICESTI 2017

References 1. S. Deerwester, S.T. Dumais, R. Harshman, Indexing by latent semantic analysis [Online] from http://lsa.colorado.edu/papers/JASIS.lsi.90.pdf (1992) [Accessed on 17 September 2017] 2. S. Dikli. JTLA, 5,1 (2006). https://files.eric.ed.gov/fulltext/EJ843855.pdf 3. M.M. Islam, A.S.M. Latiful Hoque. JCP, 7,3:616–626 (2012). http://www.jcomputers.us/index.php?m=content&c=index&a=show&catid=125&id=1 Fig. 7. Graph the average rate of system accuracy percentage by respondents. 969

4. R.H Gusmita, R. Manurung, Penerapan latent semantic analysis untuk menentukkan kesamaan makna antara kata dalam Bahasa Inggris dan Bahasa Indonesia [Applied latent semantic analysis to determine same meaning between words in English and Indonesian] [Online] from http://digilib.mercubuana.ac.id/manager/t!@file_artikel_abstrak/Isi_Artikel_39179235 2024.pdf. (2008) [Accessed on 25 January 2018]. [in Bahasa Indonesia] 5. Ch.A. Kumar, S. Srinivas, CIT, 17,3:259–264 (2009). http://cit.fer.hr/index.php/CIT/article/view/1751 6. R.B.A. Pradana, Z.K.A. Baizal, Y. Firdaus. Automatic essay grading system menggunakan metode latent semantic analysis, Seminar Nasional Aplikasi Teknologi Informasi (Yogyakarta, Indonesia 2011). Seminar Nasional Aplikasi Teknologi Informasi (SNATI):E78-86 (2011). http://journal.uii.ac.id/index.php/Snati/article/view/2217 7. S.A. Sugianto, Liliana, S. Rostianingsih. Jurnal Infra, 1,2:119–124 (2013). https://www.neliti.com/id/publications/105718/pembuatan-aplikasi-predictive-text- menggunakan-metode-n-gram-based 8. Munir, L.S. Riza, A. Mulyadi. IJTRD, 3,6:403–407 (2016). http://www.ijtrd.com/papers/IJTRD5412.pdf 9. N.T. Thomas, A. Kumar, K. Bijlani. Procedia Computer Science, 58:257–264 (2015). https://ac.els-cdn.com/S1877050915021304/1-s2.0-S1877050915021304- main.pdf?_tid=29855a6a-fed1-11e7-9742- 00000aab0f6b&acdnat=1516556184_8c82910250d80802478e13aedd6c965f 10. T. Kakkonen, N. Myller, E. Sutinen. Applying part-of-speech enhanced LSA to automate essay grading. Proceedings of the 4th IEEE International Conference on Information Technology: Research and Education (Tel Aviv, Israel, 2006). https://arxiv.org/abs/cs/0610118

5 Conclusion

17