South Florida Journal of Development, Miami, v.1, n.2, 22-36, apr./jun. 2020. ISSN 2675-5459

Pattern Recognition in Scientific Social Networks: A Systematic

DOI: 10.46932/sfjdv1n2-001

Received in: February 1st, 2020 Accepted in: February 29th, 2020

Tobias Ribeiro Sombra
Master in Information Science from the Federal University of Rio de Janeiro
Institution: Brazilian Institute of Information, Science and Technology (IBICT) - Federal University of Rio de Janeiro
Address: 455 Lauro Muller Street, Rio de Janeiro, RJ, Brazil

Rose Marie Santini
PhD in Information Sciences at the Federal University of Rio de Janeiro
Address: 455 Lauro Muller Street, Rio de Janeiro, RJ, Brazil

Emerson Cordeiro Morais
PhD in Systems Engineering and Computing from the Federal University of Rio de Janeiro
Institution: Federal Rural University of Amazônia
Address: 2501 Presidente Tancredo Neves Avenue, Belém, PA, Brazil

Walmir Oliveira Couto
PhD student in Informatics at the Federal University of Paraná
Institution: Federal Rural University of Amazônia
Address: 2501 Presidente Tancredo Neves Avenue, Belém, PA, Brazil

Alex de Jesus Zissou
PhD student in Agronomy at the Federal Rural University of Amazônia
Institution: Federal Rural University of Amazônia
Address: 2501 Presidente Tancredo Neves Avenue, Belém, PA, Brazil

Pedro Silvestre da Silva Campos
PhD in Agrarian Sciences from the Federal Rural University of Amazônia
Institution: Federal Rural University of Amazônia
Address: 2501 Presidente Tancredo Neves Avenue, Belém, PA, Brazil

José Felipe Souza de Almeida
PhD in Agronomy from the Federal Rural University of Amazônia
Institution: Federal Rural University of Amazônia
Address: 2501 Presidente Tancredo Neves Avenue, Belém, PA, Brazil

Otavio Andre Chase
PhD in Electrical Engineering and Energy Systems from the Federal University of Pará
Senior Member of IEEE
Institution: Federal Rural University of Amazônia
Address: 2501 Presidente Tancredo Neves Avenue, Belém, PA, Brazil

ABSTRACT
This article presents a Systematic Literature Review that quantifies the studies involving Pattern Recognition and Online Scientific Social Networks. The search was also expanded to cover metrics combined with Pattern Recognition. The motivation was the authors' need to find references on this topic to support related scientific work. Eight databases were used: Library and Information Science Abstracts (LISA), Library, Information Science and Technology Abstracts (LISTA), Sociological Abstracts, SocINDEX, IEEE, Web of Science, Scopus, and SAGE Journals Online. After a series of data treatments, the results indicate that the topic is recent and that it still lacks a research agenda for its consolidation.

Keywords: Scientific Social Networks, , Naïve Bayes, Machine Learning

1 INTRODUCTION
This article presents a Systematic Literature Review that aims to find research involving the use of Pattern Recognition in Online Scientific Social Networks. Interest in this study arose from an article published in Nature entitled Online Collaboration: Scientists and Social Networks (NOORDEN, 2014). That article points to growing interest among scientists in Online Scientific Social Networks. Its survey shows that networks such as ResearchGate and Academia.edu are widely used to maintain a profile in case someone wants to get in touch, which suggests that many researchers see their profiles as a way to increase their professional presence online. Other popular uses include posting work-related content, discovering related peers, tracking metrics and finding recommended research papers. Given the growth in the use of Scientific Social Networks, this work evaluates whether there are studies involving Pattern Recognition in various contexts, including as an alternative metric applied in scientific works. The intention is to verify the number of documents that address Pattern Recognition in Online Scientific Social Networks.

2 LITERATURE REVIEW
2.1 BIBLIOMETRY, WEBOMETRY AND ALTMETRY: DEFINITION AND TRANSITION
Bibliometry, according to Fenner (2014, p. 180), "is a major subdiscipline of scientometrics that measures the impact of scientific publications. The analysis of citations is the most popular application of bibliometry". In other words, bibliometry evaluates impact mainly through citations. Webometry, according to Björneborn (2004, p. 1217), is "the study of the quantitative aspects of the construction and use of information resources, structures and technologies on the web, drawing on bibliometric and informetric approaches". The transition from Webometry to Altmetry came from the need for new metrics. According to Butler (2017), "[...] the explosion of social media, together with the development of professional and scientific popular websites and blogs, has led to the need for alternative metrics, known as Altmetry, to quantify a broader impact of research". Altmetry thus arose from the need for other metrics in view of the great growth of social media. Butler (2017) defines Altmetry as follows:

Altmetry uses web-based metrics to assess the broader impact of academic material, with an emphasis on social media as data sources. The term article-level metrics refers to the types of data collected, which include views, downloads, clicks, notes, tweets, shares, recommendations, tags, posts, trackbacks, discussions, bookmarks and comments; not just citations of an article in a database or by a publisher. Butler (2017, p. 226).

Altmetry therefore seeks to broaden the sources used to assess impact. According to Priem et al. (2010), altmetry has more indicators than traditional metrics. It looks for impact not only in citations, but also in comments on social networks, blogs, etc.

2.2 SYSTEMATIC REVIEW OF LITERATURE
The definition of Systematic Review is widespread in the literature, and the concepts presented are similar. Biolchini et al. (2007) state that a systematic review is developed in a formal manner and follows a strict sequence of methodological steps. They also state that it has a central core that expresses the set of concepts and terms involved in a central research question. Kitchenham (2004) states that a Systematic Literature Review is "a means of identifying, evaluating, and interpreting all relevant research available for a particular research question, or topic in an area, or phenomenon of interest". This indicates that a systematic review can be developed for any research as an evaluative method.
Systematic Review is widely used in different contexts. As examples, we can cite the works of Khosravipour and Khanlari (2020), Fisher et al. (2020) and Le et al. (2018). The first performs a systematic review to investigate the association between exposure to road traffic noise and myocardial infarction, the second uses the method to identify factors that contribute to the gendered experiences of Australian undergraduate students in STEM subjects (Science, Technology, Engineering, Mathematics) and the third conducts a systematic review to identify and discuss future directions of decision-making in construction supply chain management.

2.3 PATTERN RECOGNITION
According to Rocha et al. (2008), pattern recognition is part of a larger process called Knowledge Discovery in Databases (KDD). This process studies how to convert raw data into high-level knowledge. It takes a number of sequential steps, which are:
1. Selection of the dataset according to the objectives of the KDD process;
2. Pre-processing, which includes data integrity assessment, noise reduction, handling of incomplete information and techniques to reduce the number of variables;
3. Data mining, which aims to adapt the selected data to the purpose of the KDD process, and which covers the choice of algorithm for data classification and the search for knowledge patterns;
4. Interpretation, verification and evaluation of the results obtained;
5. Use and management of the acquired knowledge.
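The five steps listed above can be illustrated with a very small pipeline. The following is only a sketch under stated assumptions: the records, the field layout, the selection threshold and the "mining" rule (splitting profiles at the mean citations-per-publication ratio) are all hypothetical and are not taken from Rocha et al. (2008) or from any of the reviewed studies.

```python
from statistics import mean

# Hypothetical raw records: (profile_id, citations, publications).
# None marks incomplete information.
RAW_RECORDS = [
    ("a", 120, 10),
    ("b", None, 4),
    ("c", 15, 5),
    ("d", 300, 20),
    ("e", 8, 2),
]


def select(records, min_publications):
    """Step 1: keep only the records that match the objective."""
    return [r for r in records if r[2] >= min_publications]


def preprocess(records):
    """Step 2: drop incomplete records and reduce two variables to one ratio."""
    return [(pid, cites / pubs) for pid, cites, pubs in records if cites is not None]


def mine(features):
    """Step 3: a deliberately simple pattern, splitting profiles at the mean ratio."""
    threshold = mean(value for _, value in features)
    return {pid: ("high" if value >= threshold else "low") for pid, value in features}


def evaluate(labels):
    """Step 4: summarize the labels so the result can be inspected."""
    summary = {}
    for label in labels.values():
        summary[label] = summary.get(label, 0) + 1
    return summary


def run_pipeline(records):
    """Step 5: return the acquired knowledge so that it can be used downstream."""
    labels = mine(preprocess(select(records, min_publications=4)))
    return labels, evaluate(labels)


labels, summary = run_pipeline(RAW_RECORDS)
print(labels)
print(summary)
```

In a real study the mining step would be replaced by one of the algorithms discussed in this review, such as Decision Trees, Neural Networks or K-Means.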

3 METHODOLOGY
The method consists of the processes designed to carry out the Systematic Literature Review, which aims to identify the state of the art of the research theme proposed in this work. The purpose is to search for and analyze existing research on Pattern Recognition applied to Online Scientific Social Networks. The review was performed in 3 sequential steps:
1 - Literature search
2 - Cleaning and processing of the data
3 - Review strategy
To collect the documents, 8 databases were selected as sources: Library and Information Science Abstracts (LISA), Library, Information Science and Technology Abstracts (LISTA), Sociological Abstracts, SocINDEX, IEEE, Web of Science, Scopus and SAGE Journals Online. The justification for the use of each database is given below:
• LISA: It is a database of references and abstracts intended for library professionals, information scientists and other specialists in related fields. Main areas of coverage include: information management; information technology; internet; knowledge management; library science; libraries and archives; library management; library use and users; information retrieval. It gives access to more than 480 journal titles with availability ranging from 1966 to the present.
• LISTA: The LISTA database is composed of 493 periodicals, more than 286 of which are full text. Many of these titles are peer reviewed. It also offers access to , conference publications and reports. Access to abstracts ranges from 1964 to the present and full text content from 1965 to the present. The main areas of coverage include: , cataloguing, classification, information management, library science and information retrieval.
• Sociological Abstracts: The database offers specialized references and abstracts in Sociology and related disciplines of the social, human and behavioral sciences. The main areas of coverage include: culture and social structure; family and marriage; history and theory of sociology; organizational sociology; political sociology; poverty and homelessness; race and ethnicity; social change and economic development; social control; sociology in health and medicine; sociology of education. It allows access to more than 5,350 journal titles, in addition to books, reports, congress proceedings, translations and restricted circulation documents. The availability of access ranges from 1900 to the present.
• SocINDEX: It is a database that offers access to more than 4,000 peer-reviewed journals in the area of Social Sciences, with more than 809 in full text, as well as titles of books and other materials such as educational resources, newspapers, , reports and congress proceedings. The availability of access to abstracts and references ranges from 1905 to the present and the availability of access to full texts ranges from 1908 to the present.
• IEEE: The database offers about 463 journals, with access availability ranging from 1872 to the present; over 17,874 conferences with access availability ranging from 1951 to the present; and over 2,895 technical standards with access availability ranging from 1949 to the present. It covers the electrical, electronic and computing fields and related areas of science and technology.
• Web of Science: The database allows access to references and abstracts from all areas of knowledge.
Through the Web of Science, tools are available for the analysis of citations, references and the h-index, allowing bibliometric analysis. It covers approximately 12,000 periodicals. The subscription to this content through the CAPES Periodicals Portal offers the possibility of consulting 5 collections: 1) Science Citation Index Expanded (SCI-EXPANDED), with access available from 1945 to the present; 2) Social Sciences Citation Index (SSCI), with access available from 1956 to the present; 3) Conference Proceedings Citation Index - Science (CPCI-S), with access available from 1991 to the present; 4) Arts & Humanities Citation Index (A&HCI), with access available from 1975 to the present; 5) Conference Proceedings Citation Index - Social Science & Humanities (CPCI-SSH), with access available from 1991 to the present.
• Scopus: The Scopus database indexes peer-reviewed academic titles, titles, conference proceedings, trade publications, series, scientific content web pages (gathered in Scirus) and patents from patent offices. It has functions to support the analysis of results (bibliometrics) such as identification of authors and affiliations, analysis of citations, analysis of publications and the h-index. It covers the areas of Biological Sciences, Health Sciences, Physical Sciences and Social Sciences. The access period is from 1823 to the present.
• SAGE Journals Online: The database offers access to 608 journal titles in the areas of Applied Social Sciences, Human Sciences and Medicine. The availability of access to the full text ranges from January 1999 to the present.
It is important to note that all the justifications for the databases mentioned above were collected from the CAPES Periodicals Portal, entering only the name of the database in the term field and using the expression "All areas of knowledge" in the knowledge area field. The only exception is LISTA, whose description was found on the EBSCO website. The direct link is available in the bibliographical references of this article.
After the selection of the databases, the search strategy was defined. In general, it was based on a set of terms selected for two thematic axes: the first "Scientific Social Networks" and the second "Pattern Recognition". The term Systematic Review was used separately, only to check whether there is other research on the subject based on the Systematic Review method. Table 1 shows all the terms used in the search, separated by thematic axis.

Table 1 - Terms used in the search, separated by thematic axis.
Scientific Social Networks: Terms | Scientific Social Networks: Strings | Pattern Recognition: Terms | Pattern Recognition: Strings
Scientific Social Networks | Scien* Social Network* | Machine Learning | Machine Learning
Reference Manager | Reference Manager | Pattern Recognition | Pattern Recognition
Google Scholar | Google Scholar | Artificial Intelligence | Artificial Intelligence
Mendeley | Mendeley | Decision Trees | Decision Trees
ResearcherID | ResearcherID | Neural Networks | Neural Network*
Orcid | Orcid | Bayesian Networks | Bayesian Network*
academia.edu | academia.edu | K-Nearest Neighbor | K-Nearest Neighbor
Microsoft Academic Search | Microsoft Academic Search | Genetic Algorithms | Genetic Algorithms
ResearchGate | ResearchGate | K-NN | K-NN
Altmetrics | Altmetric* | Mining | Mining
Webmetrics | Webmetric* | |
Scientia.net | Scientia.net | |

Some of the terms used in the "Scientific Social Networks" axis of Table 1 deserve comment. Altmetrics and Webmetrics were used to find papers on altmetry or webometry combined with Pattern Recognition. Scientia.net was included because a recent systematic review conducted by Sombra et al. (2020) found studies involving this Scientific Social Network together with Pattern Recognition methods. The other terms in the Scientific Social Networks axis were chosen based on the results of the survey published in Nature, which points to the growth of interest in these Social Networks among scientists around the world and describes how they use them.


The database queries were based on two combinations of terms built from the axes presented in Table 1. Combination 1 contains the terms of both axes, while Combination 2 adds the term Systematic Review in order to find out whether there are systematic review studies involving the central research question of this article. Table 2 shows the pattern of the term combinations.

Table 2 - Standard structure of combinations used in databases.

Combination 1:
“Scien* Social Network*” OR “Reference Manager” OR Mendeley OR “Google Scholar” OR “Research Gate” OR ResearcherID OR orcid OR academia.edu OR “Microsoft Academic Search” OR “webmetric*” OR “altmetric*”
AND
“Machine Learning” OR “Artificial Intelligence” OR “Pattern Recognition” OR “Decision Trees” OR “Neural Network*” OR “Bayesian Network*” OR “K-Nearest Neighbor” OR K-NN OR “Genetic Algorithm*” OR Mining

Combination 2:
“Scien* Social Network*” OR “Reference Manager” OR Mendeley OR “Google Scholar” OR “Research Gate” OR ResearcherID OR orcid OR academia.edu OR “Microsoft Academic Search” OR “webmetric*” OR “altmetric*”
AND
“Machine Learning” OR “Artificial Intelligence” OR “Pattern Recognition” OR “Decision Trees” OR “Neural Network*” OR “Bayesian Network*” OR “K-Nearest Neighbor” OR K-NN OR “Genetic Algorithm*” OR Mining
“Systematic Review”
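The combinations in Table 2 can be assembled programmatically from the two axes. The sketch below is illustrative only. Two details are assumptions rather than facts from the article: the parentheses around each axis are added here to make the operator precedence explicit, and the AND that attaches "Systematic Review" in Combination 2 is assumed, because Table 2 does not show the operator used.

```python
# Strings copied from the two thematic axes of Table 2.
SOCIAL_AXIS = [
    '"Scien* Social Network*"', '"Reference Manager"', 'Mendeley',
    '"Google Scholar"', '"Research Gate"', 'ResearcherID', 'orcid',
    'academia.edu', '"Microsoft Academic Search"', '"webmetric*"',
    '"altmetric*"',
]
PATTERN_AXIS = [
    '"Machine Learning"', '"Artificial Intelligence"', '"Pattern Recognition"',
    '"Decision Trees"', '"Neural Network*"', '"Bayesian Network*"',
    '"K-Nearest Neighbor"', 'K-NN', '"Genetic Algorithm*"', 'Mining',
]


def or_group(terms):
    """Join terms with OR; the surrounding parentheses are an assumption."""
    return "(" + " OR ".join(terms) + ")"


def combination_1():
    """Both thematic axes joined by AND."""
    return or_group(SOCIAL_AXIS) + " AND " + or_group(PATTERN_AXIS)


def combination_2():
    """Combination 1 plus the Systematic Review term (the AND is assumed)."""
    return combination_1() + ' AND "Systematic Review"'


print(combination_1())
print(combination_2())
```

Keeping the axes as lists makes it straightforward to split one of them into smaller groups when a database restricts the size of a query.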

The patterns presented in Table 2 were used in all the databases mentioned above, but some modifications were needed for IEEE and Scopus. IEEE limits the number of terms in a single query (15 terms in total), while Scopus limits the number of characters in a search. To search these databases with all the listed terms, the Pattern Recognition axis had to be broken into groups, which reduces the number of terms per query for IEEE and the number of characters per query for Scopus. The search treatment was therefore different for each of these two databases. Table 3 shows the division of the Pattern Recognition axis into groups for IEEE.

Table 3 - Division of the Pattern Recognition axis into groups at IEEE.
Group | Strings
Group 1 | “Machine Learning”, “Artificial Intelligence”, “Pattern Recognition”
Group 2 | “Decision Trees”, “Neural Network*”, K-NN
Group 3 | “Bayesian Network*”, “K-Nearest Neighbor”, Mining
Group 4 | “Genetic Algorithm*”
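The grouping in Table 3 can be checked against the 15-term limit mentioned for IEEE. The sketch below assumes that every string, quoted phrase or single word, counts as one term toward that limit, and that each query combines the 11 strings of the Scientific Social Networks axis from Table 2 with one group. How IEEE actually counts terms is not described in the text, so this is an illustration rather than a description of the database.

```python
IEEE_TERM_LIMIT = 15   # limit per query stated in the text
SOCIAL_AXIS_SIZE = 11  # strings in the Scientific Social Networks axis of Table 2

# Groups exactly as listed in Table 3.
IEEE_GROUPS = {
    1: ['"Machine Learning"', '"Artificial Intelligence"', '"Pattern Recognition"'],
    2: ['"Decision Trees"', '"Neural Network*"', 'K-NN'],
    3: ['"Bayesian Network*"', '"K-Nearest Neighbor"', 'Mining'],
    4: ['"Genetic Algorithm*"'],
}


def terms_per_query(groups, fixed_terms):
    """Number of terms in each query: the fixed axis plus one group."""
    return {group: fixed_terms + len(terms) for group, terms in groups.items()}


counts = terms_per_query(IEEE_GROUPS, SOCIAL_AXIS_SIZE)
within_limit = all(total <= IEEE_TERM_LIMIT for total in counts.values())
print(counts, within_limit)
```

Under these assumptions every query stays below the limit, with 14 terms for Groups 1 to 3 and 12 terms for Group 4.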

For Scopus, the division into groups was different because of the character limit imposed by the database. Table 4 shows the division of the Pattern Recognition axis into groups for Scopus.


Table 4 - Division of the Pattern Recognition axis into groups at Scopus.
Group | Strings
Group 1 | “Machine Learning”
Group 2 | “Artificial Intelligence”
Group 3 | “Pattern Recognition”
Group 4 | “Decision Trees”
Group 5 | “Neural Network*”
Group 6 | “Bayesian Network*”
Group 7 | “K-Nearest Neighbor”
Group 8 | K-NN
Group 9 | “Genetic Algorithm*”
Group 10 | Mining

Table 4 shows that the number of groups for Scopus was higher than for IEEE, precisely because of the character limit of that database: the terms of the Scientific Social Networks axis alone practically filled the maximum number of characters allowed by Scopus. Another difference between IEEE and Scopus and the other databases used for document collection concerns the term Scientia.net. For IEEE and Scopus this term was searched separately from the others, while for the other databases this was not necessary. In these separate searches, only the term Scientia.net was used, followed by the terms of the Pattern Recognition axis. A further difference applies only to the IEEE database: the expression "Document Title" was inserted in the strings to indicate that the terms should be matched only in the title. Figure 1 provides an example of the use of this expression in IEEE. Image generated on May 25th, 2017.

Figure 1 – IEEE query example using Document Title.
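A title-restricted IEEE expression can be sketched as follows, using the field-prefix form that appears in the IEEE query reported in the Results section. The helper names are hypothetical and the term lists are shortened for illustration; they are not the full axes.

```python
def document_title_clause(terms):
    """Prefix every term with the "Document Title" field and join with OR."""
    return "(" + " OR ".join('"Document Title":' + term for term in terms) + ")"


def ieee_title_query(social_terms, pattern_terms):
    """AND together the title-restricted clauses for the two axes."""
    return (document_title_clause(social_terms)
            + " AND "
            + document_title_clause(pattern_terms))


# Shortened, illustrative term lists (not the full axes).
query = ieee_title_query(
    ['Mendeley', '"Google Scholar"'],
    ['"Decision Trees"', 'K-NN'],
)
print(query)
```

Restricting every term to the title narrows the result set considerably, which should be kept in mind when comparing the IEEE totals with those of the other databases.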


All the procedures described were essential for the systematic collection of documents. After this process, the data had to be treated and cleaned. The first step was to remove duplicate documents. Next, criteria for assessing the relevance of each article to the main question of the review were defined, based on the following questions:
1 - Is it a Systematic Review?
2 - Does the work deal with Scientific Social Networks as the main object of research?
3 - Does the work deal with metrics (altmetry and webometry)?
4 - Does the work address Pattern Recognition?
For an article to be included in the literature selection, the answer had to be "yes" at least for questions 2 and 4 or for questions 3 and 4. In other words, the article should at least address Scientific Social Networks and Pattern Recognition, or metrics and Pattern Recognition. Question 1, on Systematic Review, was not an exclusion criterion. It was used to identify and separate studies that are themselves systematic literature reviews on the same theme. An article is not considered relevant when the answer to question 4 is "no" or when the answers to both questions 2 and 3 are "no"; that is, an article that does not address Pattern Recognition, or that addresses neither Scientific Social Networks nor metrics (altmetry and webometry), is automatically disregarded. The selection was based on reading the title and .
With the cleaning of the data completed, the next step was to check the articles and obtain the final result of the literature search in the databases mentioned. The next section presents the results of this Systematic Review and the total number of documents found.
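The cleaning and selection rules described above can be expressed compactly. The following is a minimal sketch, not the procedure actually used by the authors: the `Article` record, the sample entries and the choice to detect duplicates by normalized title are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Article:
    title: str
    systematic_review: bool     # question 1
    social_networks: bool       # question 2
    metrics: bool               # question 3
    pattern_recognition: bool   # question 4


def deduplicate(articles):
    """Keep the first occurrence of each title (assumed duplicate criterion)."""
    seen, unique = set(), []
    for article in articles:
        key = article.title.strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(article)
    return unique


def is_relevant(article):
    """Include when (Q2 and Q4) or (Q3 and Q4); Q1 never excludes."""
    return article.pattern_recognition and (article.social_networks or article.metrics)


def screen(articles):
    """Return the selected articles and, separately, the systematic reviews among them."""
    selected = [a for a in deduplicate(articles) if is_relevant(a)]
    reviews = [a for a in selected if a.systematic_review]
    return selected, reviews


# Hypothetical sample, not data from the review.
sample = [
    Article("Paper A", False, True, False, True),
    Article("paper a ", False, True, False, True),   # duplicate of Paper A
    Article("Paper B", False, False, True, True),
    Article("Paper C", True, True, True, False),     # no Pattern Recognition
    Article("Paper D", False, False, False, True),   # neither networks nor metrics
]
selected, reviews = screen(sample)
print([a.title for a in selected], len(reviews))
```

Because question 4 is required in both inclusion paths, the rule reduces to "Pattern Recognition and (Scientific Social Networks or metrics)".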

4 RESULTS
This section presents the results of all searches that returned documents. In the databases LISA, LISTA, SAGE Journals Online, SocINDEX, Sociological Abstracts and Web of Science, two queries were made for each one, following the combinations described in the previous section. For IEEE, 10 queries were made, and for Scopus, 22. Summing all the queries made in the databases gives a total of 44. Table 1 shows the total number of documents found per search, listing only the sets of terms that returned some result. All searches were conducted on March 2, 2017.
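The query counts reported above can be reproduced with simple arithmetic. The breakdown below is one reading that matches the stated totals, not a breakdown given explicitly in the text: it assumes that each Pattern Recognition group (4 for IEEE, 10 for Scopus) and the separate Scientia.net search were each run under both combinations.

```python
COMBINATIONS = 2  # Combination 1 and Combination 2

# Databases queried with the unmodified combinations.
STANDARD_DATABASES = [
    "LISA", "LISTA", "SAGE Journals Online",
    "SocINDEX", "Sociological Abstracts", "Web of Science",
]
IEEE_GROUPS = 4        # groups in Table 3
SCOPUS_GROUPS = 10     # groups in Table 4
SCIENTIA_SEARCHES = 1  # Scientia.net searched separately in IEEE and Scopus


def query_counts():
    """Count the queries per database family under the assumptions above."""
    counts = {
        "standard": len(STANDARD_DATABASES) * COMBINATIONS,
        "IEEE": (IEEE_GROUPS + SCIENTIA_SEARCHES) * COMBINATIONS,
        "Scopus": (SCOPUS_GROUPS + SCIENTIA_SEARCHES) * COMBINATIONS,
    }
    counts["total"] = sum(counts.values())
    return counts


print(query_counts())
```

Under this reading the six standard databases account for 12 queries, IEEE for 10 and Scopus for 22, which adds up to the 44 queries stated.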


Table 1 - Total of articles found by search.

Search 1 (LISA) - Total of articles: 1
“Scien* Social Network*” OR “Reference Manager” OR Mendeley OR “Google Scholar” OR “Research Gate” OR ResearcherID OR orcid OR academia.edu OR “Microsoft Academic Search” OR “webmetric*” OR “altmetric*” OR Scientia.net AND “Machine Learning” OR “Artificial Intelligence” OR “Pattern Recognition” OR “Decision Trees” OR “Neural Network*” OR “Bayesian Network*” OR “K-Nearest Neighbor” OR K-NN OR “Genetic Algorithm*” OR Mining

Search 2 (LISTA) - Total of articles: 1
“Scien* Social Network*” OR “Reference Manager” OR Mendeley OR “Google Scholar” OR “Research Gate” OR ResearcherID OR orcid OR academia.edu OR “Microsoft Academic Search” OR “webmetric*” OR “altmetric*” OR Scientia.net AND “Machine Learning” OR “Artificial Intelligence” OR “Pattern Recognition” OR “Decision Trees” OR “Neural Network*” OR “Bayesian Network*” OR “K-Nearest Neighbor” OR K-NN OR “Genetic Algorithm*” OR Mining

Search 3 (Web of Science) - Total of articles: 6
“Scien* Social Network*” OR “Reference Manager” OR Mendeley OR “Google Scholar” OR “Research Gate” OR ResearcherID OR orcid OR academia.edu OR “Microsoft Academic Search” OR “webmetric*” OR “altmetric*” OR Scientia.net AND “Machine Learning” OR “Artificial Intelligence” OR “Pattern Recognition” OR “Decision Trees” OR “Neural Network*” OR “Bayesian Network*” OR “K-Nearest Neighbor” OR K-NN OR “Genetic Algorithm*” OR Mining

Search 4 (IEEE) - Total of articles: 2
("Document Title":“Scien* Social Network*” OR "Document Title":“Reference Manager” OR "Document Title":Mendeley OR "Document Title":“Google Scholar” OR "Document Title":“Research Gate” OR "Document Title":ResearcherID OR "Document Title":orcid OR "Document Title":academia.edu OR "Document Title":“Microsoft Academic Search” OR "Document Title":“webmetric*” OR "Document Title":“altmetric*”) AND ("Document Title":“Decision Trees” OR "Document Title":“Neural Network*” OR "Document Title":K-NN)

Search 5 (Scopus) - Total of articles: 1
“Scien* Social Network*” OR “Reference Manager” OR Mendeley OR “Google Scholar” OR “Research Gate” OR ResearcherID OR orcid OR academia.edu OR “Microsoft Academic Search” OR “webmetric*” OR “altmetric*” AND “Decision Trees”

Search 6 (Scopus) - Total of articles: 1
“Scien* Social Network*” OR “Reference Manager” OR Mendeley OR “Google Scholar” OR “Research Gate” OR ResearcherID OR orcid OR academia.edu OR “Microsoft Academic Search” OR “webmetric*” OR “altmetric*” AND “Neural Network*”

Search 7 (Scopus) - Total of articles: 4
“Scien* Social Network*” OR “Reference Manager” OR Mendeley OR “Google Scholar” OR “Research Gate” OR ResearcherID OR orcid OR academia.edu OR “Microsoft Academic Search” OR “webmetric*” OR “altmetric*” AND Mining

Search 8 (Scopus) - Total of articles: 1
Scientia.net AND “Machine Learning” OR “Artificial Intelligence” OR “Pattern Recognition” OR “Decision Trees” OR “Neural Network*” OR “Bayesian Network*” OR “K-Nearest Neighbor” OR K-NN OR “Genetic Algorithm*” OR Mining

As can be seen, of the 44 searches conducted, only 8 returned results. In IEEE, 2 articles were found, and both were discarded under the inclusion and exclusion criteria described in the previous section, because they did not deal with Pattern Recognition and metrics applied to Scientific Social Networks. In Web of Science, 7 articles were found, of which 4 were selected during data treatment and cleaning, 3 were not considered relevant and one was repeated (the repeated article was discarded because the search in other databases returned the same article). In Scopus, 7 articles were also found, of which one was considered pertinent, two were discarded and 5 were repeated in the searches in other databases. In other words, some of the Scopus articles selected for analysis had already been returned by other databases. LISA and LISTA returned only 1 relevant result, which also appeared in other databases. SocINDEX, SAGE Journals Online and Sociological Abstracts did not return pertinent results, and none of the databases returned documents presenting a Systematic Review involving Scientific Social Networks and metrics with Pattern Recognition. In addition, the article found in the review by Sombra et al. (2020), which applies these methods to Scientia.net, was included in the research.


In short, only 5 documents were selected as relevant to the central research question under the inclusion and exclusion criteria presented earlier. Figure 2 summarizes the results found in the databases after cleaning and processing the data.

Figure 2 - Result of the data collected in the databases.

Based on this information, the study of Pattern Recognition in Scientific Social Networks can be considered a very new field of research that calls for an urgent research agenda in information science and computing. The graph above shows that only 5 articles were considered after the removal of repeated documents (marked in red in Figure 2). This indicates that there are very few studies on Scientific Social Networks in relation to Pattern Recognition. The articles found in the databases are from Ortega (2015), Alheyasat (2016), Huang and Yuan (2012), Lima and Machado (2011) and Lima, Machado and Lopes (2015).
The first article, entitled Differences and evolution of scholarly impact in Google Scholar Citations profiles: An application of Decision Trees, uses Decision Trees to analyze the production and impact of more than 3000 profiles collected from Google Scholar Citations in order to identify the segments (classified by gender, academic post and discipline) with the greatest success in terms of scientific impact. Decision Trees also helped to identify and explore the attributes that characterize scientific impact by dividing the profiles during the process.
The second article, Investigation and Analysis of Research Gate User’s Activities using Neural Networks, investigates the activities of a sample of one million ResearchGate (RG) users using Neural Networks. The objective of the work is to identify, with the sample collected from ResearchGate and using Neural Networks, some correlation between the data in the users' profiles and their number of followers. This correlation is expected to arise from the publication of academic papers, in addition to the number of citations and views of a given profile.
The third article, Mining Google Scholar Citations: An Exploratory Study, classifies several disciplines related to the field of computing in Google Scholar Citations using the K-Means algorithm. The disciplines analyzed were: Data Mining, Artificial Intelligence, Bioinformatics, Information Retrieval, Machine Learning and Pattern Recognition. In analyzing these disciplines, the authors aimed to verify the general citation patterns, the correlation between metric indices, the personal citation patterns of researchers and the transformation of research topics over time. The metrics "are used to quantify the impact of an individual's research result, although it is far from sufficient to compare and evaluate the research work in a comprehensive manner" (HUANG; YUAN, 2012, our translation).
The fourth article, Machine Learning Algorithms applied in Automatic Classification Of Social Network Users, applies machine learning algorithms to the Scientia.net social network to verify which algorithm (by paradigm and individually) performs best in user classification. The authors used 4 artificial intelligence algorithms. Two of them follow the supervised paradigm (Neural Networks and Support Vector Machine) and the other two follow the unsupervised paradigm (K-Means and Kohonen Networks). The work therefore compares the algorithms both by paradigm (analyzing them grouped by paradigm) and individually (checking the performance of each one on its own).
The fifth article, Automatic labeling of social network users Scientia.Net through the machine learning supervised application, is signed by the same authors as the previous work, with the collaboration of Lucas Lopes, and uses almost the same methodological procedure. The difference lies in the labeling process applied. This process uses an unsupervised algorithm to define clusters and then applies a supervised algorithm to each attribute of each cluster. Evaluating the unsupervised algorithms makes it possible to identify which attributes are relevant to the problem. The authors classify users in Scientia.net using K-Means (an unsupervised algorithm for cluster generation) and Multilayer Perceptron Neural Networks (a supervised algorithm for data training).

5 FINAL CONSIDERATIONS

Based on the results presented, it is clear that there is little research on Pattern Recognition in Scientific Social Networks, so the topic can be considered a research novelty. One possible reason for the small number of studies is the barriers encountered when conducting research involving these networks, since access to data becomes more difficult once the Scientific Social Networks gain commercial interest. This phenomenon is rejected by the part of the scientific community that advocates open science for society.


REFERENCES

ALHEYASAT, O. Investigation and Analysis of Research Gate User’s Activities using Neural Networks. The International Arab Journal of Information Technology. v. 13, n. 2, p. 320-325, mar. 2016.

BIOLCHINI, J.; MIAN, P.; NATALI, A.; CONTE, T.; TRAVASSOS, G. Scientific Research Ontology to Support Systematic Review in Software Engineering. Advanced Engineering Informatics. v. 21, n. 2, p. 133–151, apr. 2007.

BJÖRNEBORN, L. Small-World Link Structures across an Academic Web Space: A Library and Information Science Approach. Doctoral thesis. Royal School of Library and Information Science, 2004. 399 p.

BUTLER, J. et al. The Evolution of Current Research Impact Metrics: From Bibliometrics to Altmetrics? Clinical Spine Surgery. v. 30, n. 5, p. 226-228, jun. 2017.

FENNER, M. Altmetrics and Other Novel Measures for Scientific Impact. In: Opening Science: The Evolving Guide on How the Internet is Changing Research, Collaboration and Scholarly Publishing. Springer Cham Heidelberg New York Dordrecht London, 2014.

FISHER, C.; THOMPSON, C.; BROOKES, R. Gender differences in the Australian undergraduate STEM student experience: a systematic review. Higher Education Research & Development. DOI: 10.1080/07294360.2020.1721441, feb. 2020.

HUANG, Z.; YUAN, B. Mining Google Scholar Citations: An Exploratory Study. In: International Conference on Intelligent Computing Technology – ICIC, 8, China. Lecture Notes in Computer Science - LNCS 7389, 2012, p. 182–189.

KHOSRAVIPOUR, M.; KHANLARI, P. The association between road traffic noise and myocardial infarction: A systematic review and meta-analysis. Science of The Total Environment. v. 731, n. 4. may. 2020.

KITCHENHAM, B. A.; DYBÅ, T.; JØRGENSEN, M. Evidence-Based Software Engineering. In: Proceedings of ICSE. IEEE Computer Society Press, p. 273–281, May 23-28, 2004.

LE, P. et al. Present focuses and future directions of decision-making in construction supply chain management: a systematic review. International Journal of Construction Management. v. 20, n. 5, p. 490-509, oct. 2018.

Library, Information Science and Technology Abstracts. Subjects Include. Available in: . Accessed in: June 17th, 2020.

LIMA, B.; MACHADO, V. Machine Learning Algorithms applied in Automatic Classification Of Social Network Users. IV Congresso Tecnológico TI e Telecon. INFOBRASIL, 2011. Original in Portuguese.

LIMA, B.; LOPES, L.; MACHADO, V. Automatic labeling of social network users Scientia.Net through the machine learning supervised application. Social Network Analysis and Mining. jul. 2015.


NOORDEN, R. V. Online Collaboration: Scientists and the social network. Available in: . Accessed in: June 15th, 2020.

Portal de Periódicos Capes, Acervo. Available in: . Accessed in: June 17th, 2020.

PRIEM, J.; TARABORELLI, D.; GROTH, P.; NEYLON, C. Altmetrics: A manifesto. oct. 2010. Available in: . Accessed in: June 15th, 2020.

ORTEGA, J. Differences and evolution of scholarly impact in Google Scholar Citations profiles: An application of Decision Trees. Revista Española de Documentación Científica. v. 38, n. 4, 2015. Original in Spanish.

ROCHA, M.; CORTEZ, P.; NEVES, J. Análise Inteligente de Dados – Algoritmos e Implementação em Java. Lisboa: FCA – Editora de Informática, 2008.

SOMBRA, T. et al. Redes Sociais Científicas e Inteligência Artificial – Uma Revisão Sistemática aplicada a Reconhecimento de Padrões. Brazilian Journal of Development. v. 6, n. 3, p. 9957-9970, mar. 2020.
