dblp: open indexing for the community

Marcel R. Ackermann

July 9, 2019

Schloss Dagstuhl, Leibniz Center for Informatics

What is dblp?

Open scholarly indexing for the community


Est. 1993

- 4.6 million computer science publications
- 85,000 tables of contents

Directory of journals and conference series
- 5600 conference series
- 1600 journals

Curated author bibliographies
- 2.3 million authors and editors
- homonym/synonym disambiguation

Open data & APIs
- open metadata and query APIs
- whole dataset available as dump
- license (ODC-by)

Our key principles

- Openness – all data is freely available for reuse

- Neutrality of the data set – additions selected according to significance to the community, no publisher preference, supervised by advisory board

- Data quality – continuous curation & disambiguation

- Semantic enrichment – adding structure and linking to external resources

- Enabling of research – by providing means for literature search, and as a research data set

- User orientation – built to meet the needs of the international computer science research community
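As a small illustration of the openness principle and the open query APIs, the sketch below builds a URL for dblp's publication search. The endpoint and the parameter names (`q`, `format`, `h`) reflect dblp's public API documentation as understood at the time of writing and should be checked against the current documentation; the function name and the example query are hypothetical.

```python
from urllib.parse import urlencode

# Publication search endpoint; treat the URL and parameter names as
# assumptions and verify them against the current dblp API documentation.
DBLP_PUBL_SEARCH = "https://dblp.org/search/publ/api"


def build_search_url(query: str, max_hits: int = 10, fmt: str = "json") -> str:
    """Return a dblp publication search URL for the given query string."""
    params = {"q": query, "format": fmt, "h": max_hits}
    return f"{DBLP_PUBL_SEARCH}?{urlencode(params)}"


# Example: a search URL for a title fragment, limited to five hits.
example_url = build_search_url("Derksen ideals", max_hits=5)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; for bulk analyses the full data dump is the more appropriate source.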

What dblp is not ...

- following Library Standards
- Digital Library
- Safeguard against Fake Science
- Rankings / Ratings
- anything else but CS
- Business Plan
- A huge team

A brief history of dblp

400,000+ new records per year

- 1993: Michael Ley starts dblp at Trier University as a one-person project
- 1997: ACM SIGMOD Contributions Award
- 2000: funding by ACM SIGMOD Anthology
- 2002: DFG projects (IO-Port/SemIPort); donations by VLDB and Microsoft
- 2010: donations by Tschira Foundation
- 2011: Schloss Dagstuhl joins in co-maintaining dblp; formation of the dblp team
- 2018: Schloss Dagstuhl takes over full responsibility for running dblp; permanent staff and funding secured

How are new venues added to dblp?

Input channels (diagram): publisher's data deliveries, metadata submission, discovery by dblp team, community suggestions

- indexing on a volume-by-volume basis
- once a venue is selected, we aim to include all past and future volumes

Criteria for inclusion (ideally)

- as defined by our Advisory Board

Aspects of the venue
- topics from computer science
- discernible thematic focus
- regular publishing cadence
- publisher or society support

Aspects of authors/editors
- board/committee consists of distinguished experts
- authors are renowned experts
- internationality

Publication standards
- original works
- peer-review process
- typesetting, structure, conventions
- aimed at international readers

Accessibility
- metadata openly available, complete, and unambiguous
- long-term availability of full-text content
- persistent IDs (e.g., DOI)
- automated metadata retrieval

Discovery: "community prominence"

Diagram: Publishers / WWW → metadata crawlers → internal XML format → ToC analysis → internal discovery DB

Example record in the internal XML format: Gregor Kemper: Using extended Derksen ideals in computational invariant theory. J. Symb. Comput. 72: 161-181 (2016), db/journals/jsc/jsc72.html

ToC analysis:
1) match authors with dblp
2) "prominence" ≈ size of dblp profile
3) statistics of average article
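The three ToC analysis steps can be sketched roughly as follows. This is an illustrative sketch only, not the dblp team's implementation: the function names, the example author names, treating unmatched authors as prominence 0, and the use of mean and median as the "statistics" are all assumptions. The slide itself only states that prominence ≈ size of the dblp profile.

```python
from statistics import mean, median


def paper_prominence(authors, profile_size):
    """Steps 1 and 2: look up each author; prominence is the mean profile size.

    Authors without a dblp profile count as 0 (an assumption of this sketch).
    """
    if not authors:
        return 0.0
    return mean(profile_size.get(name, 0) for name in authors)


def toc_statistics(toc, profile_size):
    """Step 3: aggregate per-paper prominence over one table of contents."""
    scores = [paper_prominence(authors, profile_size) for authors in toc]
    names = [name for authors in toc for name in authors]
    matched = sum(1 for name in names if name in profile_size)
    return {
        "mean_prominence": mean(scores) if scores else 0.0,
        "median_prominence": median(scores) if scores else 0.0,
        "matched_author_fraction": matched / len(names) if names else 0.0,
    }


# Hypothetical data: profile sizes (number of dblp records) and a two-paper ToC.
profile_size = {"A. Author": 120, "B. Builder": 30}
toc = [["A. Author", "B. Builder"], ["C. Newcomer"]]
stats = toc_statistics(toc, profile_size)
```

A venue whose tables of contents consistently show a high matched-author fraction and high average prominence would, under these assumptions, rank higher in the discovery queue.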

cf. Neumann et al.: Prioritizing and Scheduling Conferences for Metadata Harvesting in dblp. JCDL 2018: 45-48

There are lots of dodgy applications ...

From: Dr. T.

Dear Sir/madam

In order to provide a broad and timely coverage of ever-evolving field of computer scinces and engineering, the IJXXXXX offers ite readers an OPEN access Online journal with a mix of regular and special issues which would contribute new and advanced results in the filed of the computer science and engineering. […] To maintain the standard of Journal and to make the presentation of articles scholarly, we would like our journal to be indexed by you. You are requested to send your confirmation for the same.

Thanks
Dr. T.

IJXXXXX Editorial Board:
● Prof. B.
● Prof. J.
● Prof. W.
● …

From: Prof. B.

[…] Indeed now looking carefully at my own emails between myself and IJXXXXX show a different story. I was invited to contribute to the journal and buried further down in the same email to become a member of the editorial board. My response was that I had no time to contribute. They replied thanking me and asking whether they could request my advice In the future to which I said sure. So I never formally agreed to be on their board. […]

From: Prof. J.

I have just seen that I am listed as an advisory editor of this journal. Please let me know who the editor in chief is. *I never agreed to be on this editorial board so please remove my name Immediately.* Please let me know why I have been included. This information may be needed for further action. I suspect this may be a scam journal. No links to any past issues work for me on the webpage.

From: Dr. T.

May I clarify:
1. The IJXXXXX is indexed and abstracted almost by all world famous databases and now DBLP is the database, where we are trying to be indexed.
2. All the members are given their names only after their concerns, but it seems that some misunderstanding has occurred on Prof. J.'s part, which we are trying to rectify.
3. THIS IS NOT A SCAM JOURNAL. We have ISSN.
4. NO links to the previous issues are working only because recently we find that some publishers are trying to download the papers and then resell them to the students. However all the papers published are still having access to be downloaded.
In particular,we request you all that NOT to MISUNDERSTOOD us and PLS DONT MAKE ANY NEGATIVE PUBLICITY FOR THE JOURNAL.

PIDs for conferences

goal: force venues to commit to their story!

Crossref & DataCite working group
- open conference metadata
- no bibliometrics!
- based on DOI infrastructure
- supported by all major publishers in CS
- work in progress!


CHE German University Ranking

Number of records retrieved (chart)
- total persons: 2,607
- total records: 25,719

10 most frequent venues retrieved (chart)

Fraction of flawed query names (chart)
- average: 13.3 %

Effect of query name cleaning
- Independent of cleaning (TP): 23,743
- Mismatched prior to cleaning (FP): 626
- New matches after cleaning (FN): 1,976

Jaccard difference: $\frac{\mathrm{FP} + \mathrm{FN}}{\mathrm{TP} + \mathrm{FP} + \mathrm{FN}} = 9.9\,\%$
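The Jaccard difference can be recomputed from the three counts on the slide. The sketch below assumes TP means records retrieved both before and after name cleaning, FP records retrieved only before, and FN records retrieved only after; the function names and the set-based variant are illustrative additions, not part of the original analysis.

```python
def jaccard_difference(tp: int, fp: int, fn: int) -> float:
    """Share of records that differ between the two retrieval runs."""
    union = tp + fp + fn
    if union == 0:
        return 0.0
    return (fp + fn) / union


def jaccard_difference_of_sets(before: set, after: set) -> float:
    """Same measure computed directly from two sets of record IDs."""
    union = before | after
    if not union:
        return 0.0
    return len(before ^ after) / len(union)


# Counts taken from the slide: TP = 23,743, FP = 626, FN = 1,976.
slide_value = jaccard_difference(23743, 626, 1976)
print(f"{slide_value:.1%}")  # 9.9%
```

The counts are also consistent with the reported total: TP + FN = 23,743 + 1,976 = 25,719 records.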

"Good Science" – the tip of the iceberg?

Iceberg diagram, labels as they appear on the slide:
- Excellence!
- Good Science?
- Satellite Workshop
- Locally Relevant Workshop
- Community Meeting
- Normal Science
- Ph.D. Workshop
- Fake Science?
- Work-in-progress Presentation
- Well-intended but badly organized
- Fake Science

Thanks!

https://dblp.org

[email protected] @dblp_org
