Report:

Information and tips on search engines like Google Scholar and other ways your work is, and can be made, more visible on the net.

A personal perspective

Dr. R. C. A. Keller

Date: 2019 Nijmegen, the Netherlands 1st edition


Preface

This document summarizes the (hard) lessons I learned while investigating and using the internet to make my work more visible. Google Scholar is normally my main source of information, but along the way I came across many other platforms, search engines and so on. One thing led to another. I sincerely hope that readers will find this attempt to share what I learned during my quest on the world wide web useful, and that it helps them to work things through faster and more efficiently.

Comparison of academic search engines and/or databases where a researcher can promote their work and get ideas on how their own work is perceived in terms of views, citations and so on.

In short: Google Scholar (sorry Researchgate…) is to me the most frequently used and most comprehensive database to search for papers (and it gives one of the best, if not the best, indications of the number of times a paper is cited). One of its most annoying drawbacks: you have no influence whatsoever on the citation count (even though citations are sometimes missing or falsely assigned).

Second best is Researchgate. The database is pretty complete and (often) contains full-text versions of the papers, which is extremely convenient. On top of that, the forum is a great feature: you can ask questions or ask for advice, and in turn share your expertise or give an educated guess when someone else needs a clue. One of the most annoying drawbacks, besides having no influence whatsoever on the citation count, is that their search engine misses a lot (too many, I think) of references to papers in their own database (even those with DOI info etc. are too often not picked up or recognized).

Microsoft Academic Search is (in its second attempt) increasingly becoming an alternative to Google Scholar. It is not as comprehensive (yet), but one can feel that it is coming closer and closer to becoming a real player in the field of search engines and citation overviews. One of the drawbacks is that it relies on the Bing search engine. Although Bing is pretty good, it is clear that Google is (still) superior in terms of coverage, speed, how up to date it is and so on. Because of this it will be hard, if not impossible, to outperform Google Scholar.

ScienceOpen is a potentially interesting one. Relying strongly on ORCID data, it is able to pick up quite a number of your own papers (including pretty recent ones) and, unlike Google Scholar and Microsoft Academic Search, it gives you as an author a bit more influence on which papers are displayed in their database and how. An obvious drawback is that their database and set-up only give a pretty poor citation count, which is not even close to that of Google Scholar and Researchgate.

Kudos has presented itself as a real innovation in this field. The novelty lies in the fact that you can present your work in a new way, for example more focussed on a non-expert audience and highlighting your personal ideas about the relevance and possible impact of the work. An important drawback is that their database is rather limited and the community involved in this initiative is small and apparently not that active.

SemanticScholar is becoming increasingly relevant. It is going through a transition from a computationally oriented engine towards a much broader field, so this might be something to have a closer look at. A drawback is that their database is based on a rather unique kind of search engine that needs some help from time to time to go in the right direction. Despite a reasonably good help function, it all seems to work rather slowly or, to put it mildly, it requires quite some patience.

PubFacts is a search engine, database and (social network) forum in one. When it comes to publications (and citations) it is primarily based on PubMed data. If you publish in fields that are not covered by PubMed, it is hard (if not impossible) to fully integrate your work here. However, they have a rather novel way to highlight the work and profiles of researchers. The novelty of PubFacts lies in their point system, which basically gives incentives for exposing your work on social media platforms. My overall impression is that they represent a moderately active community of researchers. A drawback is that their database and set-up only give a pretty poor citation count, which is not even close to that of Google Scholar and Researchgate, because they ‘limit’ themselves to PubMed data.

Scinapse is another search engine. It uses data obtained from Microsoft Research, PubMed and Springer Nature. A drawback is that to me it is somewhat unclear what this service adds to what is already available. But at least there is yet another service that helps you to search the literature and allows you to make collections of interesting papers to read and/or use later on.

Dimensions presents itself as “a next-generation linked research information system”. It is well linked to ORCID and uses data provided by, for example, Altmetric. It can be a convenient way to add (missed) publications to your ORCID record. A drawback is that I personally am not impressed by their search engine: too many unrelated items show up once you enter some keywords. It is also not particularly user-friendly. It takes time to get the most out of it, to put it mildly.

ORCID is not so much a real search engine, but it is relevant for other search engines and databases described here. For example, as said above, ScienceOpen pretty much relies on it. ORCID was originally meant to provide the ‘final’ solution for linking the right author to the right paper. Put in other words, it is a way for an individual author to claim his or her original work. It is expected to work better and better once more and more journals require an ORCID iD for the author(s) that submit a manuscript. Together with the well-established DOI identifier, the right author is then linked to the right paper unambiguously.
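The idea of an unambiguous author identifier can be illustrated with a small sketch. An ORCID iD consists of 16 characters, the last of which is a check character computed with the ISO 7064 MOD 11-2 algorithm, so a mistyped iD can be caught before it is linked to a paper. The function names below are made up for this illustration, and the iD in the example is the sample iD that ORCID itself uses in its documentation, not that of a real author.

```python
def orcid_check_character(base_digits: str) -> str:
    """Return the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check the length, the digits and the check character of an ORCID iD."""
    compact = orcid.replace("-", "").upper()
    if len(compact) != 16:
        return False
    if not all(ch in "0123456789" for ch in compact[:15]):
        return False
    return orcid_check_character(compact[:15]) == compact[15]


# Sample iD taken from the ORCID documentation (an assumption of this sketch).
print(is_valid_orcid("0000-0002-1825-0097"))
# Same iD with the last character changed on purpose.
print(is_valid_orcid("0000-0002-1825-0098"))
```

A check like this only tells you that an iD is well-formed. Whether it belongs to the right person still has to be confirmed in the ORCID record itself.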

Detailed description, including tips:

Google Scholar is, as we all know, a specialized search engine from Google fully focused on scholarly work (in the broadest meaning of the word). Nice features include the listing of different versions of a particular paper and the number of times it is cited (Cited by). Good overviews of how to use Google Scholar can be found everywhere. See for example: https://www.wikihow.com/Use-Google-Scholar or https://scholar.google.nl/intl/en/scholar/about.html

Numerous studies have examined how Google Scholar performs compared to well-established databases like Scopus and Web of Science. See for example [2]: https://www.semanticscholar.org/paper/Comparison-of-PubMed%2C-Scopus%2C-Web-of-Science%2C-and-Falagas-Pitsouni/2a5a64b5a740c8379ef2bd81f0c2d4e6ae66c1d8

Google Scholar allows you to create your own profile. It will look like this:

[Screenshot of a Google Scholar profile]


Tip nr. 1: How to get stuff included that can be influenced by yourself. If you want something (paper, thesis, project description, preprint etc.) included in Google Scholar you could try Researchgate. Documents uploaded there are (most of the time) quickly visible in the ‘normal’ Google (within approximately a week). Depending on (well… what exactly remains completely unclear, Google magic I guess) they are then picked up by Google Scholar (this varies from a couple of days up to more than half a year, if ever). The same is/was true for Academia.edu. To a lesser extent it applies to SemanticScholar and ScienceOpen as well.

Tip nr. 2: How to get pdf files included. See tip nr. 1. Furthermore you can try SemanticScholar as well. Once a publication of yours is included in their database it will eventually be picked up by Google Scholar. ScienceOpen is picked up by Google Scholar as well, although ScienceOpen only includes papers in this way that are already Open Access. Of course, when you work at a university and/or an applied research institute, the publications present in their repositories are (often) picked up by Google Scholar as well.

Researchgate is often called the Facebook for scientists and for those directly or indirectly involved in research. The database is pretty complete and (often) contains full-text versions of the papers, which is, as said, extremely convenient. There is quite some debate about their troublesome relationship with the (big) publishers, but this topic is beyond the scope of this report. One interesting reference I would like to share pretty much summarizes my point of view as well [1]. On top of that, the forum, where you can ask questions or ask for advice and in turn share your expertise or give an educated guess when someone else needs a clue, is a great feature and is pretty active. Last but not least, Researchgate gives you some information about the interest in your work. Let’s face it, one of the main reasons why you should spend some of your precious time on something like Researchgate is that you want your work to be noticed, read and of course possibly be of use to others. The so-called Stats feature gives you information about the reads of your papers/work, the number of citations and recommendations (divided into recommendations for your answers, your publications or added projects). The RG Score is a somewhat awkward metric that leads (inside the Researchgate community) to quite some debate. However, regardless of what one might think about it and no matter how much one can argue about its value, my personal experience is that the Score is surprisingly indicative of the relative status of a particular researcher. Similarly, the so-called Total Research Interest gives another relative indicator (strongly based on the number of publications and citations) of the interest in and relevance of your work.

Tip nr. 1: Use the DOI feature for anything other than an already ‘official’ publication. If you have a preprint, a chapter of your thesis that you want to highlight, a report and so on, you can use the built-in feature in Researchgate to create an (RG) DOI for such a document. One advantage is that in this way you can easily add such a document to your ORCID record, after which it is picked up by, or can be added to, for example Publons (ResearcherID) and ScienceOpen.

Tip nr. 2: Stats, RG Score and Total Research Interest. The so-called Stats feature gives you, as said, some information about the reads of your papers/work, the number of citations and recommendations (divided into recommendations for your answers, your publications or added projects). You can see how many of your reads are from RG members and how many come from ‘outside’. The Score value largely depends on your publications (number of publications, impact factor, number of citations and so on). However, Researchgate also wants to reward your activity on the forum. So (preferably meaningful) answers to questions, asking questions and the number of followers within Researchgate can contribute to a higher Score. The idea is that not only your publications but also your activity within the scientific community of Researchgate determines your influence/impact in research. The Total Research Interest value is informative for yourself (and those interested in you). The idea is that not only your number of citations but also the number of reads tells something about the (potential) impact of, and interest in, your work. Adding as much as possible related to your work (supplements and, as said above, other types of scholarly work) can generate more reads and thereby give a higher Total Research Interest score.

ScienceOpen pretty much relies on ORCID data. After some sort of checking, publications are validated, and after that you are able to add additional info to your papers: thumbnails, extra data files and so on. The number of views is also updated, together with some other statistics.

Tip: Be patient. It is not always clear why one paper is ‘validated’ immediately while others seem to take an awfully long time. Since the system seems to work best with the data imported from your ORCID record, it can be worthwhile to put a bit of work into it using the well-known ‘trial and error’ method: vary the so-called preferred source in ORCID and see (after updating your ORCID record in ScienceOpen, which you find underneath the dashboard option) whether this helps the search engine behind all this to get your paper validated.


SemanticScholar works slowly, but with a “help team” behind it all you can get the search engine to work for you. It ultimately finds pretty much all of your published work.

Tip: Be patient. As said, when it comes to building up a personal record where all your published work is correctly assigned to you, it is good to realize that SemanticScholar works slowly. However, the “help team” actually responds: some things are a matter of a couple of working days, while integrating other matters will take two to three weeks.

PubFacts is, as said, a search engine, database and (social network) forum in one. When it comes to publications (and citations) it is primarily based on PubMed data. If you publish in fields that are not covered by PubMed, it is hard (if not impossible) to fully integrate your work here. Still, a lot of members add non-PubMed material to their profile.

Tip: Be patient. As with a number of the services mentioned earlier, it is a matter of patience here too. When it comes to building up a personal record where all your published work is correctly assigned to you, it is good to realize that PubFacts also works slowly. However, here too the “help team” actually responds, and gradually you see a correct personal profile building up. Some things are a matter of a couple of working days, while other matters seem to take longer.

Some data and additional information

In order to get some idea of how the different search engines/databases perform in terms of citation counts, I used myself as a ‘typical’ example. Based on this single case study (n=1) the trends are clear. The ISI Web of Science count is taken as the standard (100%) since it is, whether one agrees or not, the reference for evaluation procedures etc. It is clear that both Google Scholar and Microsoft Academic Search give considerably more citations from scholarly work, up to 30% more; see Figure 1 for more details. This substantially higher count is obviously due to the fact that far more scholarly work is included in their databases. Researchgate and Scopus give substantially more citations than Web of Science as well. Researchgate indexes more types of scholarly work once these are included in their database, while Scopus also includes book chapters, for example. CrossRef gives a reasonable number of citations; it is less than Web of Science, presumably because not all journals make use of their service and older literature is simply not included. PubMed (or rather PMC) gives lower citation counts as well, primarily because they limit themselves to a smaller field of science. ScienceOpen is relatively new, and their much lower count is primarily due to the fact that their search engine simply does not allow a more accurate citation count (yet). For a scientifically more rigorous analysis see elsewhere [2-5].
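The normalization used for this comparison is simple arithmetic: each service's citation count is divided by the count of the reference service and multiplied by 100. A minimal sketch is given below. The function name and all numbers are hypothetical placeholders for illustration only; they are not the values behind Figure 1.

```python
from __future__ import annotations


def relative_to_reference(counts: dict[str, int], reference: str) -> dict[str, float]:
    """Express every citation count as a percentage of the reference count."""
    ref = counts[reference]
    if ref <= 0:
        raise ValueError("reference count must be positive")
    return {name: round(100 * n / ref, 1) for name, n in counts.items()}


# Hypothetical placeholder numbers, NOT the data shown in Figure 1.
example_counts = {
    "Web of Science": 200,
    "Google Scholar": 260,
    "CrossRef": 180,
}

for service, pct in relative_to_reference(example_counts, "Web of Science").items():
    print(f"{service}: {pct}%")
```

To repeat the comparison for your own situation, replace the placeholder numbers with the total citation counts that each service reports on your own profile.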


Figure 1: Comparison of different services that provide information about the number of times your work is cited. The ISI Web of Science count is set to 100%.

In my personal view Google Scholar gives, when it comes to times cited, the best-justified indication of the impact of your scholarly work. Microsoft Academic Search and Researchgate come close to the numbers seen in Google Scholar, although the Microsoft Academic Search count is an estimate and its ‘real’ counts are substantially lower. Still, ISI Web of Science (and sometimes Scopus as well) is currently the official standard for evaluating someone’s scientific impact in the scientific community. There is a lively debate on whether citations should be so prominent in the evaluation of scientific performance [6-8], but this goes beyond the scope of what I am trying to discuss in this document.

Figure 2 indicates some of the links between the different search engine/database services available. It is clear that Google Scholar and ORCID are central players in this web of different sources.


Figure 2: An indication of the links between the different services described in this document.

FAQ

Is it possible to add a publication to Google Scholar manually?
The answer is basically NO. You can add details about a publication to your own Google Scholar profile, but as far as I can tell this has no influence whatsoever on what actually appears in the Google Scholar database. That database is filled with files that are mysteriously collected online.

Why do citation counts differ so much between, for example, Google Scholar and Researchgate?
This is because of the difference in the way the data are collected. Google Scholar basically counts everything it can find for your publication, although it depends to a large degree on how things are presented to it. Because of this latter point Google Scholar is (still) missing a certain number of citations, but it is nevertheless the most comprehensive collector of citations to date (see for example [2]). Researchgate depends to a large degree on the papers present in its own database, whereas Kudos, for example, relies on what CrossRef tells it. Microsoft Academic Search relies on its own Bing search engine. Personally I find Google Scholar the most informative (and in a way the best indicator of the impact of your work); Web of Science, though, is still the most accepted indicator of someone’s impact among the various scientific bodies.


Is Open Access always better than publishing in a journal behind a paywall?
The obvious advantage of Open Access is of course the possibility for EVERYONE to read your paper. Nothing is more annoying than finally finding that particular article you are looking for and not being able to read the full text. The obvious disadvantage is the sometimes ridiculous fee some journals ask for publication (sometimes more than 3000 euro…). It has been argued that Open Access leads to more citations [9], on average up to a staggering 52%! Although a number of arguments against this phenomenon can be brought up (see also the literature cited in [9]), it seems almost a fact that Open Access leads to more citations. I could not resist ‘checking’ this so-called fact for my own personal situation. In Figure 3 you can see that for me it is the other way around: more citations for papers published in non-Open Access journals. Most closed publications are older, so I also attempted to correct for this by calculating the number of citations per year; the trend is the same, with more citations for closed papers. So much for the ‘fact’ that Open Access always leads to more citations. Of course this is nothing more than a call to check this for your own personal situation.

Figure 3: Average number of citations per publication in closed (blue) and Open Access (red) journals. Left: uncorrected. Right: corrected for the number of years since publication. Closed is taken as 1.
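The kind of check shown in Figure 3 can be sketched in a few lines for anyone who wants to repeat it for their own papers. The sketch assumes that "corrected" means dividing each paper's citation count by the number of years since publication before averaging; the function name, the reference year and all data below are hypothetical and are not the numbers behind Figure 3.

```python
from statistics import mean

CURRENT_YEAR = 2019  # assumed reference year for "years since publication"


def average_citations(papers, per_year=False):
    """Mean citations per paper, optionally divided by years since publication."""
    values = []
    for year, citations in papers:
        years_out = max(CURRENT_YEAR - year, 1)  # avoid division by zero for new papers
        values.append(citations / years_out if per_year else citations)
    return mean(values)


# Hypothetical (publication year, citation count) pairs, NOT real data.
closed = [(1999, 40), (2009, 20)]
open_access = [(2014, 10), (2017, 6)]

for per_year in (False, True):
    ratio = average_citations(open_access, per_year) / average_citations(closed, per_year)
    label = "per year" if per_year else "uncorrected"
    print(f"Open Access relative to closed ({label}): {ratio:.2f}")
```

With these invented numbers the correction for age changes the ratio considerably, which is why it is worth computing both variants before drawing a conclusion about your own papers.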

Thanks

Let me express a word of thanks to those who took the effort and spent their precious time reading what I have to say. This document represents quite a number of hours (I don’t really want to think about how many…) spent figuring out how things really work and the ‘logic’ behind all the different systems (search engines, databases, etc.). As you will have noticed, I focussed on stuff that is free. Because of this, well-known tools like Scopus and Web of Science are not included and Academia.edu is mentioned only briefly (since they have put most of their features behind a paid (premium) account). I sincerely hope that this overview is of help to anyone who is directly or indirectly involved in research.


References:

[1] Velterop, J. Older journal articles need to be open, too [online]. SciELO in Perspective, 2017 [viewed 23 April 2019]. Available from: https://blog.scielo.org/en/2017/11/22/older-journal-articles-need-to-be-open-too/
[2] Falagas, M. E., Pitsouni, E. I., Malietzis, G. A., & Pappas, G. (2008). Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses. The FASEB Journal, 22(2), 338-342.
[3] Bakkalbasi, N., Bauer, K., Glover, J., & Wang, L. (2006). Three options for citation tracking: Google Scholar, Scopus and Web of Science. Biomedical Digital Libraries, 3(1), 7.
[4] Bar-Ilan, J. (2008). Which h-index?—A comparison of WoS, Scopus and Google Scholar. Scientometrics, 74(2), 257-271.
[5] Delgado López-Cózar, E., Orduna-Malea, E., Martín-Martín, A., & Ayllón, J. M. (2017). Google Scholar: the big data bibliographic tool. In: Cantu-Ortiz, F. J. (ed.). Research analytics: boosting university productivity and competitiveness through Scientometrics (pp. 59-80). CRC Press (Taylor & Francis). ISBN: 978-1498785426.
[6] Seglen, P. O. (1997). Why the impact factor of journals should not be used for evaluating research. BMJ, 314(7079), 497.
[7] Amin, M., & Mabe, M. A. (2003). Impact factors: use and abuse. Medicina (Buenos Aires), 63(4), 347-354.
[8] Bornmann, L., & Marx, W. (2016). The journal Impact Factor and alternative metrics: A variety of bibliometric measures has been developed to supplant the Impact Factor to better assess the impact of individual research papers. EMBO Reports, 17(8), 1094-1097.
[9] Archambault, É., Côté, G., Struck, B., & Voorons, M. (2016). Research impact of paywalled versus open access papers. Copyright, Fair Use, Scholarly Communication, etc. 29. http://digitalcommons.unl.edu/scholcom/29
