The MICCAI Hackathon on reproducibility, diversity, and selection of papers at the MICCAI conference

Fabian Balsigera,b,∗, Alain Jungoa,∗, Naren Akash R Jc, Jianan Chend, Ivan Ezhove, Shengnan Liuf, Jun Mag, Johannes C. Paetzolde, Vishva Saravanan Rh, Anjany Sekuboyinae, Suprosanna Shite, Yannick Sutera, Moshood Yekinii, Guodong Zengj, Markus Rempflerk,∗

aARTORG Center for Biomedical Engineering Research, University of Bern, Bern, Switzerland bSupport Center for Advanced Neuroimaging (SCAN), Institute for Diagnostic and Interventional Neuroradiology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland cCenter for Visual Information Technology, IIIT Hyderabad, Hyderabad, India dDepartment of Medical Biophysics, University of , Toronto, Canada eTechnical University of , Munich, Germany fDepartment of Cardiology, Erasmus MC University Medical Center, Rotterdam, The Netherlands gDepartment of Mathematics, Nanjing University of Science and Technology, Nanjing, China hIIIT Hyderabad, Hyderabad, India iAfrican Masters of Machine Intelligence, Accra, Ghana jsitem Center for Translational and Biomedical Entrepreneurship, Bern, Switzerland kFriedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland

Abstract The MICCAI conference has encountered tremendous growth over the last years in terms of the size of the community, as well as the number of contributions and their technical success. With this growth, however, come new challenges for the community. Methods are more difficult to reproduce and the ever-increasing number of paper submissions to the MICCAI conference poses new questions regarding the selection process and the diversity of topics. To exchange, discuss, and find novel and creative solutions to these challenges, a new format of a hackathon was initiated as a satellite event at the MICCAI 2020 conference: The MICCAI Hackathon. The first edition of the MICCAI Hackathon covered the topics reproducibility, diversity, and selection of MICCAI papers. In the manner of a small think-tank, participants collaborated to find solutions to these challenges. In this report, we summarize the insights from the MICCAI Hackathon into immediate and long-term measures to address these challenges. The proposed measures can be seen as starting points and guidelines for discussions and actions to possibly improve the MICCAI conference with regards to reproducibility, diversity, and selection of papers. Keywords: MICCAI, reproducibility, diversity, review, hackathon

1. Introduction to present and get to know the latest research re- lated to MICCAI. With an acceptance rate of ap- arXiv:2103.05437v2 [cs.CV] 28 Apr 2021 The MICCAI conference is an annual scientific proximately 30 % (Martel et al., 2020), the MICCAI meeting for research in conference is highly competitive and can be consid- (MIC) and computer assisted interventions (CAI). ered among the top scientific conferences research Approximately 2500 (MICCAI Society, 2020) re- in MIC and CAI. Besides the main conference that searchers attend the MICCAI conference every year stretches over three consecutive days, there exist two satellite event days, one before and one after the three days, which allow the community to orga- ∗Equal contribution and corresponding authors. The nize workshops, challenges, and tutorials dedicated other authors are listed alphabetically. to specific topics. At MICCAI 2020, for the first Email addresses: [email protected] (Fabian Balsiger), [email protected] (Alain Jungo), time, a hackathon was organized as such a satellite [email protected] (Markus Rempfler) event. the procedure is detailed in the supplementary ma- terial. The outcome of the first MICCAI Hackathon is summarized in this article. For the three topics reproducibility, diversity, and selection, immediate and long-term measures are presented in Section 2 to 4. Immediate measures might be realizable for Figure 1: The logo of the MICCAI Hackathon. upcoming MICCAI conferences and long-term mea- sures would most likely require more time and ef- The MICCAI Hackathon was initiated to ben- fort to implement. All presented measures base efit from the exchange among researchers during on the keynotes, mentoring sessions, contributions, the MICCAI conference. A hackathon is espe- and opinions of the MICCAI Hackathon organiz- cially well-suited as a satellite event at a conference, ers. Finally, a short discussion concludes the arti- where junior and senior researchers go to exchange, cle. We hope that the distilled measures could help discuss, and learn. Therefore, as a bottom-up ap- to potentially improve MICCAI, as reproducibility, proach, the MICCAI Hackathon benefits from this diversity, and selection of MICCAI papers concern fruitful environment in order to foster collabora- the whole MICCAI community. tive work. The MICCAI Hackathon can be consid- ered as a small think-tank rather than traditional 2. Reproducibility workshops with talks and poster session, podium discussions, lecture-based tutorials, hands-on ses- Reproducibility refers to that results of an exper- sions, and challenges known from satellite event iment should be achieved with similar or equal re- days. Accordingly, in this new format, participants sults when the experiment is performed by another gather and receive input from keynote speakers and researcher. In MICCAI, this often involves execut- mentors providing impulses about the hackathon’s ing computational methods on medical image data. topic. The participants then work individually or However, this is not trivial to achieve as most meth- together in teams to find solutions. Finally, they ods are highly specialized and often not publicly present their outcome at the end of the hackathon. available. For instance, the collection of MICCAI The first edition of the MICCAI Hackathon1 cov- 2020 papers with code (Ma, 2020) shows that there ered the topics reproducibility, diversity, and selec- is still only a fraction of the papers that share their tion of MICCAI papers. The advance of machine code and thereby facilitate reproducibility. Further, learning has had a considerable impact on MICCAI, as MICCAI often involves protected medical data, pushing the limits of algorithms, opening up com- sharing that data might not be straightforward or pletely new applications and ultimately, leading to even impossible. Therefore, in the category repro- an increased overall interest in MICCAI. With this ducibility, two questions were specifically investi- success, however, came new challenges for the com- gated: munity. Complex, data-driven algorithms are more difficult to reproduce and the ever-increasing num- • What does it need for a MICCAI paper to be ber of paper submissions to the MICCAI conference reproducible? poses new questions regarding the selection process • What could MICCAI do to encourage repro- and the diversity of topics. To provide inputs to the ducibility? participants regarding these topics, two keynotes were given and six mentoring sessions were held. Three out of five contributions addressed the cat- Overall, five contributions were received providing egory of reproducibility. Overall, reproducibility ideas to tackle the challenges of reproducibility, di- was the aspect with the most frequent feedback that versity, and selection of MICCAI papers. For the an improvement is desired. It is likely the category interested reader, the contributions and keynotes of where measures are the most straightforward to im- the MICCAI Hackathon are available online2, and plement and where we consider the potential for immediate improvements to be the highest. This is mostly due to the fact that MICCAI can bene- 1https://2020.miccai-hackathon.com/ 2https://www.youtube.com/playlist?list= fit from the experiences of measures taken at other PLflMBx361ODCPE3TK-ROFSblPMK5cNOto. conferences (e.g., NeurIPS (Pineau et al., 2020)). 2 2.1. Immediate Measures garding reproducibility should be enhanced. One Incorporate the reproducibility checklist way to achieve this could be an official list of open in the paper submission form (at intent-to- source MICCAI papers which could be either on submit & final submission). Incorporating a the MICCAI society website and/or even be incor- reproducibility checklist to the paper submission porated into the proceedings. might raise awareness to this important topic and Communicate best practices on repro- nudge authors towards adding important details to ducibility and code submission. Similar to the their manuscript. This measure has been effec- guidelines for authors and reviewers, there could be tive at NeurIPS 2019 (Pineau et al., 2020). The guidelines for reproducibility (or a dedicated sec- initial checklist can adhere to the list by Pineau tion in the guidelines for authors) pointing to the (2020) without adaptions. We consider it impor- reproducibility checklist (Pineau, 2020) as well as tant that the list is already present at the inten- resources like a code completeness checklist (Papers tion to submit, but with the possibility to update with Code, 2020). Similar efforts at other confer- the choices at the time of the final submission. By ences have already been made (e.g. at the EMNLP having such a list in the submission form, papers conference (EMNLP Conference, 2020b,a)) and do might be improved upon submission regarding re- not need to be recreated from scratch. producibility as people tend to forget, not intention- ally, certain aspects when writing their papers (e.g., 2.2. Long-term Measures software used, annotation protocol, pre- or post- processing steps, . . . ). Besides, priming the authors Introduce a reproducibility award. A re- at the intent-to-submit with the reproducibility producibility award might promote efforts towards checklist may create a so-called ”mere-measurement more reproducible papers. The award could con- effect” known in behavioural economy (see Green- sider the reproducibility checklist, and papers with wald et al. (1987); Morwitz and Fitzsimons (2004); public code could be (ideally automatically) com- Thaler and Sunstein (2009)) leading to more re- pared to the ML code completeness checklist on a producible manuscripts. Feedback of the authors specific date prior to the MICCAI conference, e.g., regarding the reproducibility checklist is essential. the camera-ready submission deadline. Specifica- A questionnaire could assess the acceptance of it tion of the exact procedure and forming of the com- in the community, but likely requires determining mittee could become one of the responsibilities of a reproducibility chair at MICCAI. the MICCAI reproducibility chair. Introduce a reproducibility chair at MIC- Code submission. Submitting code should be CAI. Having a reproducibility chair demonstrates considered as part of the paper submission (e.g., in that reproducibility is taken seriously and gives form of a ZIP archive). Tools for code anonymiza- credit to the individual in charge of it. This has tion exist (e.g., search and replace for certain key- been shown to be effective at NeurIPS, but also words) and the submitted code would not be re- examples like the MICCAI challenge chair indicate leased publicly. Such a code submission might fur- that this would have a positive effect. Amongst the ther raise awareness and provide the reviewers with chairs’ responsibilities would be to analyze the in- an additional way to assess papers (e.g., in case fluence of a reproducibility checklist over the years, of doubt regarding an implementation). The code gather feedback from authors and reviewers, and submission could even be delayed until the paper adapt the checklist (as well as potential other mea- matching has completed. sures) to the domain-specific needs of MICCAI. Best practices for evaluation. Best practices Include a statement of data availability in for evaluation could be released by the MICCAI the conference management toolkit (CMT). society. Separated into broad tasks (e.g., classifi- Including a separate statement in the CMT (e.g., cation, segmentation, reconstruction), lists of met- 250 words) on the availability of data could raise rics and their reference implementations could be awareness to release data. The statement could ad- provided. Further, guidelines for cross-validation, dress questions such as Is the used data publicly ablation, baselines, etc. could help authors to im- available, if not, why? Are comparable synthetic prove their papers without imposing extra effort datasets available? and would highly increase comparability across pa- Promote reproducibility efforts of authors. pers. We see that efforts to provide reference imple- The visibility of authors with increased efforts re- mentations of metrics are already going on for MIC- 3 CAI challenges and believe that these could most MICCAI conference. The format could be a paral- likely be embraced by the society once available. lel session, dedicated clinical presentations in each session, or as a panel discussion similar to MICCAI 2020. Here, also aspects of diversity could easily 3. Diversity be taken into account (i.e., different problems in different countries). The MICCAI conference attracts approximately 2500 attendees from all over the world every year. 3.2. Long-term Measures With this outreach, however, come questions about Open access proceedings. Open access to the the diversity of the attendees and the topics that are MICCAI conference proceedings would further pro- covered by the scientific program. The participants, mote diversity, as people and laboratories with lim- therefore, investigated the question: ited financial resources get free access to the latest research in MICCAI. Is MICCAI diverse enough? • Unconventional submission award. An spe- One out of five contributions covered to the cate- cific award for an unconventional submission might gory of diversity. The MICCAI conference already promote the submission of papers covering less es- puts much effort to increase the diversity of the tablished topics. Such papers might be at the pe- attendees with initiatives like Women in MICCAI, riphery of MICCAI, connect rather disjoint topics RSNA panel discussion, student travel awards, stu- or identify previously ignored, but clinically rele- dent participation awards, and alike. Often, these vant problems. Determining the award candidates initiatives promote equity, diversity, and inclusion might even be a side product of the categorization (EDI), and are appreciated by the MICCAI com- of accepted papers for the poster sessions. Papers munity. Also, the virtual setting of MICCAI 2020 that are not trivial to cluster might be good can- likely improved the diversity of attendees, and the didates. Alternatively, the reviewers or area chairs contemplated joint physical/virtual MICCAI 2021 could promote such exotic papers for this award, further substantiate these efforts of MICCAI to similar to the young scientist award. more diversity. Considering paper topic diversity, Encourage non-standard submissions in it appears to be unclear if and how more diver- the call for papers. The call for papers could in- sity should be increased. In general, we believe it clude a statement that encourages submission of un- is good that currently topic diversity is incorpo- derrepresented topics and/or the analysis of meth- rated on the level of oral presentations such that ods that provide new insights rather than novel underrepresented topics have a dedicated oral ses- methodology. sion(e.g., an oral session on reconstruction although Clarify the importance of achieving state- only a small fraction of the papers address recon- of-the-art. Challenges specifically exist to advance struction). the new state-of-the-art regarding a certain task. Hence, contributions targeting challenge datasets that ”merely” improve state-of-the-art might be 3.1. Immediate Measures better placed at the corresponding challenge. On Include a statement of clinical relevance in the other hand, papers with interesting ideas that the CMT. The clinical relevance of a MICCAI pa- do not achieve state-of-the-art might be worth be- per is often not entirely clear. Including a sepa- ing discussed at the main conference. An explicit rate statement in the CMT (e.g., 250 words) could statement in the guidelines for authors and review- help to understand the clinical relevance of a MIC- ers could clarify this circumstance. CAI paper, and might help to promote it. Such a Paper format guidelines. Further improve- statement could even be integrated into the review ments to the recently loosened page limit might be process (cf. Section 4). possible, as often size (and readability) of figures is Promote clinicians. Clinicians are a minor- sacrificed due to paper length restrictions. Ideally, ity at the MICCAI conference but ultimately our only the word count of the main text (without title, project initiators, data providers, collaborators, authors and affiliations, abstract, figure and table and customers. A dedicated clinical session where captions, acknowledgement, and references) is re- invited clinicians present their daily routine and stricted. The amount of words can be chosen such problems could promote clinical discussion at the that it meets approximately the current extent of 4 MICCAI papers. The figures and tables could be would be desirable. Further, being more transpar- limited in numbers, e.g., a total of five, but the ent about the selection of the best reviewers would captions itself should be without or weak restric- be necessary, e.g., What defines a good MICCAI tions in number of words. Moreover, the restriction paper review?. of two pages references could be further weakened. Stable paper scoring scale. The scoring sys- Such adaptions contribute to diversity (e.g., more tem of the review process was subject to heavy figures are possible to add) and also to reproducibil- changes in the last years, converging to the NeurIPS ity (e.g., more thorough descriptions of methods are scoring in the 2020 edition of MICCAI. The current possible). scoring is clear and well-understandable, and the rating scale should be kept constant over the years. Currently, only slight modification regarding word- 4. Selection ing might be necessary. Large changes from year to The papers that are submitted to the MICCAI year should be avoided to prevent confusion. conference undergo a scientific peer-review process 4.2. Long-term Measures to select the papers being presented at the confer- ence. This selection process is often subject to in- Interactive rebuttal phase. The rebuttal tensive discussion among the authors of papers, and phase is an aspect of the review process with might be perceived unfair in certain cases. There- high controversy, and several pros and cons were fore, in the category selection, two questions were raised during the MICCAI Hackathon. The rebut- investigated: tal might possibly be improved when an interaction between authors and reviewers is allowed. Such in- • What could MICCAI do to improve the review teractions could avoid misunderstandings. Impor- process? tant is that a reviewer sees only his own review and is not influenced by the other reviewers. Also, It • What defines a good MICCAI paper? should be possible to revise the review score. Ide- ally, reviewers could have a look at revised versions One out of five contributions was in the category of a submission but this might be difficult due to of selection. It is likely the category where measures the tight timeline of the conferences. Also, public are the most difficult to take and require careful reviews (similar to OpenReview) should not influ- considerations and implementation. Despite this, ence the official review process. there are certain points that would be fairly easy Adapt paper matching. The paper-to- and fast to address. reviewer matching is an aspect, where improvement is certainly possible but might also be difficult due 4.1. Immediate Measures to software constraints (TPMS, CMT). Ideally, the Open reviews. The reviews should be visible matching would consider the seniority of review- to everyone. After acceptance of the papers, the ers, e.g., match a senior and two less experienced reviews could be released on the MICCAI society reviewers. Tracking reviewers over the years, see website or in a public repository. This might foster next point below, might be required to implement less aggressively worded reviews. Further, acces- this. Further, domain conflicts (e.g., two review- sible reviews might be useful for new members of ers from the same laboratory) need to be restricted. the community, new reviewers, and also promote The unwillingness expressed by the reviewers in the further research ideas. At least, the reviewer of a bidding process should be automatically taken into paper should see the other reviews after paper deci- account in the matching algorithm (e.g. as an extra sion such that reviewers can learn from each other. set of constraints) and not require tweaking of the Better acknowledgement of reviewers. The weights and manual verification as of today. MICCAI conference acknowledges the best review- Introduce reviewer tracking. A tracking of ers every year in the closing session. However, the the reviewers over years might improve the review visibility of the best reviewers is limited to a very process (Lin et al., 2020). Senior reviewers could short window within all other topics in the closing be identified and matched together with less expe- session. Better acknowledging the best reviewers, rienced reviewers. Flagging reviewers (e.g. as fails e.g., permanently on the MICCAI society website to meet expectations, meets expectations, exceeds ex- or by bold-facing the reviewers in the proceedings pectations; see NeurIPS (Lin et al., 2020)) should 5 be possible. Both seniority and flags (from previ- The proposed measures are opinions of the orga- ous years) could be taken into account by the area nizers of the MICCAI Hackathon and are based on chairs when assigning papers. their own experience, discussions with the keynote Discussion platform for accepted papers. speakers and mentors, and the participant’s out- A public discussion on an official platform (similar comes. Therefore, the measures are subjective and to OpenReview) should be possible after the accep- do not represent the opinion of all stakeholders in- tance of papers. The platform should feature the volved at MICCAI. Further thoughts, ideas, and official and anonymized reviews but also allow to also disagreement with the suggestions are likely ask questions in an official communication channel present in the community. Therefore, it would be to the authors. essential to form a working group involving differ- Introduce clinical feedback. Implementing a ent stakeholders (PhD students, postdocs, faculty, clinical feedback might improve aspects of clinical MICCAI chairs, MICCAI board members, clini- relevance. Such a clinical feedback needs to be ad- cians, ...) to adapt and refine the proposed mea- ditional to the three reviewers (sort of a fourth re- sures further. Here, a Delphi process (Dalkey and viewer). For instance, it could base on the clinical Helmer, 1963) could help arrive at a consensus orig- statement in the CMT, and maybe involve having a inating from a representative group of stakeholders. look at the results (figures and tables). This short This white paper could serve as an inspiration and feedback could be taken into account by the area starting point for such a process. chair in case of borderline decisions or conflicting Besides the discussed topics, we also gained in- reviews. At least, it would provide interesting and sights about the format of a hackathon in the con- important insights on MICCAI papers from a clin- text of MICCAI. The collaborative nature of a ical perspective. A challenge, however, might be to hackathon introduces new possibilities to the MIC- enthuse enough clinicians to participate in such a CAI community and fits well in the setup of the review. A first step might be a dedicated clinical MICCAI conference. The first edition of the MIC- session as proposed in Section 3, promote clini- CAI Hackathon showed that this format can be cians. useful for exchange of ideas and finding creative solutions. Instead of competing, as known from challenges, the participants are required to collab- 5. Discussion orate and profit from the enriching environment The first edition of the MICCAI Hackathon cov- at the MICCAI conference. This collaborative as- ered the topics reproducibility, diversity, and selec- pect was highly endorsed by the participants, and tion of papers at the MICCAI conference. These also the keynote speakers and mentors were open topics are already known to the MICCAI commu- to this new format and very supportive. This en- nity, and are often informally discussed among col- ables community-driven impulses as, for example, leagues during the MICCAI conference. During showcased by the distilled measures. More infor- the MICCAI Hackathon, researchers of the MIC- mation on the hackathon’s organization and expe- CAI community collaborated and proposed possi- riences with it can be found in the supplementary ble solutions to challenges regarding these topics. material. The outcome of this hackathon during the satellite In conclusion, the first edition of the MICCAI events has been summarized in Section 2 to 4 into Hackathon shed some light into ways to improve the immediate and long-term measures that could po- aspects of reproducibility, diversity, and selection of tentially be implemented by future organizers of the papers at the MICCAI conference. The proposed MICCAI conference and the MICCAI society. Most measures can be seen as starting points and guide- of these measures are low-risk and might be applied lines for further discussions and actions to possibly with manageable effort. Indeed, after sharing the improve the MICCAI conference with regards to re- herein presented measures with the MICCAI board producibility, diversity, and selection of papers. and the organizers of MICCAI 2021, a reproducibil- ity checklist has already been implemented as part of the paper submission for MICCAI 20213. Acknowledgement

The MICCAI Hackathon was supported by the 3https://miccai2021.org/en/CALL-FOR-PAPERS.html Fund for the Promotion of Young Researchers 6 (Nachwuchsf¨orderungs-Projektpools) of the Inter- research (a report from the neurips 2019 reproducibility mediate Staff Association (Mittelbauvereinigung) program). arXiv preprint arXiv:2003.12206 . of the University of Bern, grant attributed to Thaler, R.H., Sunstein, C.R., 2009. Nudge: Improving deci- sions about health, wealth, and happiness. Penguin. FB and AJ. The organizers are very grateful to Anne Martel and Koustuv Sinha for giving the keynotes. The organizers are further very grateful Supplementary Materials to Anne Martel for sharing the tabular data regard- ing the paper selection at MICCAI 2020. A special The MICCAI Hackathon was organized by thank goes to Marleen de Bruijne, Mattias Hein- Fabian Balsiger, Alain Jungo, and Markus rich, Georg Langs, Ilkay˙ Oks¨uz,Lena¨ Maier-Hein, Rempfler. Being considered as part of the tuto- and Koustuv Sinha for the fruitful discussions dur- rial track by the MICCAI conference, the MICCAI ing the mentoring sessions. The organizers further hackathon proposal underwent review in this cate- thank the members of the award committee, Wilson gory (and faced some difficulties by not really be- Silva, Denis Schenk, Raphael Meier, and Mauricio ing a tutorial). Tutorials at the MICCAI satellite Reyes for judging the contributions. event days are either half-day or full-day events. For the hackathon, a full-day event seemed to be the minimum time required for the participants References to work on their ideas, ideally, the available time should be even longer. As the COVID-19 pan- Dalkey, N., Helmer, O., 1963. An Experimental Application demic led the MICCAI organizers to opt for a vir- of the DELPHI Method to the Use of Experts. Manage- ment Science 9, 458–467. doi:10.1287/mnsc.9.3.458. tual rather than a physical conference, the initial EMNLP Conference, 2020a. EMNLP 2020 Call for Papers. organizational plans were slightly adapted because URL: https://2020.emnlp.org/call-for-papers. collaboration in a virtual setting with participants EMNLP Conference, 2020b. Guest Post: Reproducibility originating from different time zones is very chal- at EMNLP 2020. URL: https://2020.emnlp.org/blog/ 2020-05-20-reproducibility. lenging. Therefore, in the end, we stretched the Greenwald, A.G., Carnot, C.G., Beach, R., Young, B., 1987. hackathon over several days and four phases, see Increasing voting behavior by asking people if they expect Figure 2. During the pre-hack phase, the keynotes to vote. Journal of Applied Psychology 72, 315. doi:10. 1037/0021-9010.72.2.315. and resources were made available to the partici- th Lin, H.T., Balcan, M.F., Hadsell, R., Ranzato, M., pants. On the 4 October, the actual event took 2020. What we learned from NeurIPS 2020 review- place within the frame of the satellite event days. ing process. URL: https://medium.com/@NeurIPSConf/ The keynote speakers and the mentors joined for what-we-learned-from-neurips-2020-reviewing-process-e24549eea38f. Ma, J., 2020. MICCAI Open Source Papers. URL: https: live sessions scattered throughout the day to an- //github.com/JunMa11/MICCAI-OpenSourcePapers. swer questions and interact with the participants. Martel, A.L., Abolmaesumi, P., Stoyanov, D., Mateus, D., Then, the participants had time to work on their Zuluaga, M.A., Zhou, S.K., Racoceanu, D., Joskow- topic(s) and to provide their presentations until the icz, L. (Eds.), 2020. Medical Image Computing and th th Computer Assisted Intervention – MICCAI 2020. vol- 8 October. On the 8 October, the second satel- ume 12261 of Lecture Notes in . lite event day, the presentations were made avail- Springer International Publishing, Cham. doi:10.1007/ able online and the participants joined for a public 978-3-030-59710-8. wrap-up session. MICCAI Society, 2020. Medical Image Computing and Computer Assisted Intervention Society Twitter To adapt to the virtual setting, the keynotes Tweet. URL: https://twitter.com/MICCAI_Society/ of Anne Martel and Koustuv Sinha were pre- status/1353636189572632576. recorded4. Anne Martel shared her view on or- Morwitz, V.G., Fitzsimons, G.J., 2004. The mere- ganizing the MICCAI conference from a program measurement effect: Why does measuring intentions change actual behavior? Journal of Consumer Psychology chair’s perspective including the aspects of selection 14, 64–74. and diversity of MICCAI papers. Koustuv Sinha Papers with Code, 2020. ML Code Completeness talked about reproducibility in machine learning Checklist. URL: https://github.com/paperswithcode/ releasing-research-code. form the perspective of a reproducibility co-chair at Pineau, J., 2020. The Machine Learning Reproducibility the NeurIPS conference. Therefore, both keynotes Checklist. URL: https://www.cs.mcgill.ca/~jpineau/ ReproducibilityChecklist.pdf. Pineau, J., Vincent-Lamarre, P., Sinha, K., Larivi`ere,V., 4The keynotes are available online at Beygelzimer, A., d’Alch´eBuc, F., Fox, E., Larochelle, https://www.youtube.com/playlist?list= H., 2020. Improving reproducibility in machine learning PLflMBx361ODCPE3TK-ROFSblPMK5cNOto. 7 Figure 2: A chronological overview of the MICCAI Hackathon in four phases. During the pre-hack phase, the keynotes and resources were made available to the participants. On the 4th October, the actual event took place within the frame of the satellite event days. The keynote speakers and the mentors joined for live sessions scattered throughout the day to answer questions and interact with the participants. Then, the participants had time to work on their topic(s) and to provide their presentations until the 8th October. On the 8th October, the second satellite event day, the presentations were made available online and the participants joined for a public wrap-up session. covered the topic of the hackathon. On the 4th Oc- participants were free to explore their own ques- tober, both keynote speakers joined for a live ques- tions and search for resources within the frame of tions and answers session. In addition, five mentors the hackathon’s topics. also joined throughout the day in 20 minutes time- The participants were required to register for slots to answer questions of the participants. The the MICCAI Hackathon. Registration was either mentors were selected based on their experience re- possible as an individual participant, as a partic- garding the hackathon’s topic and with equity, di- ipant that would like to be matched into a team, versity, and inclusion (EDI) criteria in mind. The and as a participant of an existing team. There- mentors were Marleen de Bruijne, Mattias Hein- ˙ ¨ fore, the participants specified their preferred topic rich, Georg Langs, Ilkay Oks¨uz,and Lena Maier- they would like to work on during the registration, Hein. Further, Koustuv Sinha joined for an addi- which made a matching into teams based on pref- tional mentoring session to the keynote questions erence possible. In total, 21 participants registered. and answers session. Six were individual participants, three participants A set of exemplary questions regarding the were matched into one team, and twelve partici- hackathon’s topic was prepared to facilitate the 5 pants registered as five teams. Most participants work of the participants . Resources were made were PhD students and the ethnicities were fairly available to provide further input to the partici- 6 diverse. To facilitate the collaboration among the pants . These resources comprised scientific pa- participants, a Discord server was set up that al- pers, links to blog posts and code repositories, lowed video conferencing, screen sharing, and text and also tabular data from the paper selection messaging. In the end, five teams handed in a process at MICCAI 2020. The latter included i) contribution. The overall impact of the MICCAI anonymized data on paper decisions and informa- Hackathon goes likely beyond the registered partic- tion on accepted papers, ii) anonymized data on ipants. For example, on the MICCAI 2020 confer- reviewers, i.e., bids, quotas, subject areas, affini- ence platform, approximately 80 attendees enlisted ties, and assignments to submitted papers, and iii) to the MICCAI Hackathon and might have watched anonymized data on area chairs, i.e., subject ar- the keynotes and contributions, which are still avail- eas, relevance and Toronto paper matching system able online. (TPMS) affinities to submitted papers. With these resources, the participants were further guided in Several lessons were learned in this first edition their work, and also had the opportunity to analyze of the MICCAI Hackathon. First, it is difficult to and explore real tabular data concerning paper se- find the target audience for such a new initiative, lection at MICCAI 2020. In general, however, the especially in a virtual setting. There were no ded- icated channels available to contact the MICCAI conference attendees directly to inform them about 5 Refer to https://2020.miccai-hackathon.com/ the initiative, leaving the organizers to rely on so- #topics. 6The resources are available online at https://airtable. cial media channels such as Twitter. This is ex- com/shrfRGA9FrHArvkNT. pected to be less problematic for subsequent edi- 8 tions of the MICCAI Hackathon. Second, it seems that participants are more drawn to solution-driven tasks/questions, e.g., create a machine learning re- producibility checklist specific for MICCAI, than open questions, e.g., how to ensure reproducibil- ity when the data cannot be shared. Third, the hackathon format does not entirely fit in the half- day and full-day setting of the satellite event days because it relies on collaborative work that cannot typically be performed during one satellite event day. Therefore, prolonging the hackathon over the entire conference duration was necessary. In an ideal case, the hackathon would not be restricted to the satellite event time slot(s) and could even be a separate category in the satellite events with a dedicated call for topics to be addressed.

9