1

CURRICULUM VITAE

Name: StevenSkiena Date of birth: January 30, 1961 Office: Dept. of Home: 6Storyland Lane StonyBrook University Setauket, NY 11733 StonyBrook, NY 11794 Tel.: (631) 689-5477 Tel.: (631)-632-8470/9026 email: [email protected] URL: http://www.cs.stonybrook.edu/˜skiena

Education

University of Illinois at Urbana-Champaign, Fall 1983 to Spring 1988. M.S. in Computer Science May 1985, Ph.D in Computer Science May 1988. Cumulative GPA 4.87/5.00. University of Virginia, Charlottesville VA, Fall 1979 to Spring 1983. B.S. in Computer Science with High Distinction, May 1983. Final GPA3.64/4.0, with 4.0+ GPAinComputer Science. Rodman Scholar,Intermediate Honors, Tau Beta Pi.

Employment

Founding Director,Institute for AI-DrivenDiscovery and Innovation (IAIDDI), August 2018 to date. College of Engineering and Applied Science, StonyBrook University. StonyBrook University,Empire Innovation Professor,fall 2018 to date. Distinguished Teaching Pro- fessor of Computer Science, spring 2009 to date. Professor of Computer Science, fall 2001 to spring 2009. Associate Professor of Computer Science, fall 1994 to spring 2001. Assistant Professor of Computer Science, fall 1988 to spring 1994. Also, Adjunct Professor of Applied Mathematics, spring 1991 to date. Affiliated faculty,Dept. of Biomedical Engineering, Institute for Advanced Computa- tional Science (IACS), and Graduate Program in Genetics. Yahoo Labs/Research, Visiting Scientist, NewYork, NY.summer 2015 to summer 2016. Chief Scientist, General Sentiment Inc. (www.generalsentiment.com) Woodbury NY spring 2009 to spring 2015. Hong Kong University of Science and Technology (HKUST), Visiting Professor,Dept. of Computer Science and Engineering, fall 2008 to summer 2009. Caesarea Edmond Benjamin de Rothschild Foundation Institute for Interdisciplinary Applications of Computer Science, University of Haifa, Israel, fall 2001 to spring 2002. DIMACS, Rutgers University,visitor,fall 1994 to spring 1995. Apple Computer,Advanced Technology Group, Cupertino CA, summer 1988. University of Illinois, Urbana IL, teaching assistant, fall 1983 to spring 1987. fellowship, summer 1987. research assistant, fall 1987 to spring 1988. MIT Lincoln Laboratories, Lexington MA, summers 1984 and 1986. Weizmann Institute of Science, RehovotIsrael, visitor,summer 1985. Western Electric Engineering Research Center,Princeton NJ, summer 1983. Bell Telephone Laboratories, Holmdel NJ, summer 1982. MiddlesexCounty College, Edison NJ, instructor,summer 1981. 2

Princeton Gamma-Tech, Princeton NJ, programmer,summer 1980, winters 1980 and 1981.

Awards

Distinguished Alumni Award, School of Engineering and Applied Science, University of Virginia, 2019. Fellow, American Association for the Advancement of Science, 2019. Communication, Information Technologies, and Media Sociology (CITAMS) Paper Award, 2017. Clifford James Geertz Prize for Best Article, American Sociological Association, 2014. Fulbright Scholar,2001-2002. IEEE Computer Science and Engineering Undergraduate Teaching Award, 2001. President’sand Chancellor’sAward for Excellence in Teaching, SUNY StonyBrook, 2000. ONR Young Investigator Award, 1993. EDUCOM Higher Education Software Award for Distinguished Mathematics Software, 1991. NSF Research Initiation Award, 1991. Fourth Prize, First International Design the Future Competition, Tokyo Japan, 1989. First Place, Apple Personal Computer of the Year 2000 Competition, 1988. University of Illinois Summer Fellowship in Computer Science, 1987. Honeywell Futurist Award, 1985. Winner,ACM ’85 Computer Chess Turing Test. The Daily Illini “Incomplete List of Teachers Ranked as Excellent by Their Students,”spring 1984 and 1985. First place, 21st Southeast ACM Student Papers Competition, 1983. U.VaSigma Xi Anniversary award for undergraduate research in Engineering Science, 1983.

Research Interests

Data science, design and applications, computational ,natural language processing and social media analysis.

Publications in Journals (1) Diet Modulates Brain Network Stability,aBiomarker for Brain Aging, in Young Adults (with. L. Mujica-Parodi, A. Amgalan, F.Sultan, B. Antal, X. Sun, A. Lithen, N. Adra, E. Ratai, C. Weistuch, S. Govindarajan, H. Strey, K.Dill, S. Stufflebeam, R. Veech, and K. Clarke) Proc. National Academy of Sciences (PNAS). March 3, 2020. www.pnas.org/cgi/doi/10.1073/pnas.1913042117 (2) Maximizing the Expected Value of a Lottery Ticket: HowtoSell and When to Buy (with A. Kim). Toappear in Chance. (3) Stereotypical Gender Associations in Language have Decreased Over Time (with J. Jones, R. Amin, and J. Kim) Sociological Science.January 7, 2020. 10.15195/v7.a1 Also Eastern So- ciological Society (ESS) Annual Meeting.Boston, MA, March 14-17, 2019 and 6th Interna- tional Conference on Computational Social Science (IC2S2 2020). (4) Prokaryotic Coding Regions have Little if anySpecific Depletion of Shine-Dalgarno Motifs (with Y.Yurovsky, M.Amin, J. Gardin, Y.Chen and B. Futcher). PLOS One,August 23, 3

2018. https://doi.org/10.1371/journal.pone.0202768 (5) Re-annotation of 12,495 16S rRNA3’ends and analysis of Shine-Dalgarno and anti-Shine- Dalgarno sequences (with M. Amin, Y.Yurovsky, Y.Chen and B. Futcher). PLOS One,Au- gust 23, 2018. https://doi.org/10.1371/journal.pone.0202767 (6) Latent Human Traits in the Language of Social Media: An Open-Vocabulary Approach (with V. Kulkarni, M. Kern, M. Kosinski, S. Matz, L. Unger and H. A. Schwartz). PLOS One, November 28 2018. https://doi.org/10.1371/journal.pone.0201703. (7) DeepBrowse: Serendipitous Browsing through Large Lists (with H. Chen and A. Anan- tharam). Information Systems (to appear). Invited paper from Tenth Int. Conf Similarity Search and Applications (SISAP 2017), Munich, Germany, October 4-6, 2017. (8) Who’sBigger? Where Computer Scientists Really Rank (with C. Ward). BSHM Bulletin: Journal of the British Society for the History of Mathematics 32 (2017) 257-264. (invited pa- per) Special issue, St. Andrews Mathematical Biographyconference. (9) A Codon-Shuffling-based method to Prevent Reversion During Production of Replication-De- fective Herpesvirus Stocks (with G. Li, C. Ward, R. Yeasmin, L. Krug, and J. C. Forrest). Na- tureScientific Reports 7, 44404 doi: 10.1038/sep44404 (2017). (10) Vector-Based Similarity Measurements for Historical Figures (with Y.Chen and B. Perozzi) Information Systems 2016) doi:10.1016/j.is.2016.07.001. Invited paper from Eighth Int. Conf Similarity Search and Applications (SISAP 2015), Springer-Verlag LNCS 9371, 179-190. Glasgow, Scotland. October 12-14, 2015. (11) The Books of Numbers: Quantifying Historical Trends in Numeracy(with Y.Chen). The Mathematical Intelligencer.38(2016) 67-73. (12) A Paper Ceiling: Explaining the Persistent Underrepresentation of Female Names in Printed News. (with E. Shor,A.van de Rijt, A. Miltsov, and V.Kulkarni), American Sociological Re- view 80 (2015) 960-984. DOI: 10.1177/0003122415596999 (13) Measurement of Average Decoding Rates of the 61 Sense Codons in Viv o.(with J. Gardin, R. Yeasmin, A. Yurovsky, Y.Cai, and B. Futcher). eLife 2014;10.7554/eLife.03735. Published October 27, 2014. DOI: http://dx.doi.org/10.7554/eLife.03735 (14) Is there a Political Bias? Female Subjects’sCoverage in Liberal and Conservative Newspa- pers. (with E. Shor,A.van de Rijt, C. Ward, and S. Askar). Social Science Quarterly 95 (2014) 1213-1228. DOI: 10.1111/ssqu.12091. (15) Deliberate reduction of glycoproteins HA and NAexpression of influenza virus leads to an ul- tra-protective liv e vaccine candidate in mice. (with C. Yang, B. Futcher,S.Mueller,and E. Wimmer). roc. National Academy of Sciences. (2013) 9481-9486. (16) Only Fifteen Minutes? The Social Stratification of Fame in Printed Media (with A. van de Ri- jt, C. Ward, and E. Shor). American Sociological Review 78 (2013) 266-289. Also, presented at the 106th Annual Meeting of the American Sociological Association. Las Veg as, August 20-23, 2011. (17) Time Trends in Printed News Coverage of Women, 1880-2008 (with E. Shor,A.van de Rijt, C. Ward, and A. Blank-Gomel) Journalism Studies,2013. Also, presented at the 106th Annu- al Meeting of the American Sociological Association. Las Veg as, August 20-23, 2011 and the 107th Annual Meeting of the American Sociological Association, Denver, CL, August 17-21, 2012. (18) Synthetic Sequence Design for Signal Location Search (with Y.-L. Lin and C. Ward) Algorith- mica 67(3): 368-383 (2013). Also, Proc. 16th International Conf.onComputational Molecu- lar Biology (RECOMB 2012).Barcelona Spain, April 21-24, 2012. (19) Identification of twofunctionally redundant RNAelements in the coding sequence of the po- liovirus RNApolymerase using computer generated designs and synthetic DNAsynthesis 4

(with Y.Song, Y.Liu, C. Ward, S. Mueller,B.Futcher,A.Paul, and E. Wimmer). Proc. Na- tional Academy of Sciences (PNAS) 109 (2012) 14301-14307. (20) Phase Balancing (with K. Wang and T.Robertazzi) Electric Power Systems Re- search.96(2013) 218-224. (21) Redesigning Viral Genomes IEEE Computer (invited paper) special issue on Computational- ly-DrivenExperimental Biology.45(March 2012) 47-53. (22) Computationally-recoded AAVRep 78 is Efficiently Maintained within an Adenovirus Vector (with V.Sitaraman, P.Hearing, C. Ward, D. Gnatenko, E. Wimmer,S.Mueller,and W.Bahou) Proc. National Academy of Sciences (PNAS), 108 (2011) 14294-14299. (23) Live Attenuated Influenza Vaccines by Computer-Aided Rational Design (with S. Mueller,R. Coleman, D. Papamichail, C. Ward, A. Nimnual, B. Futcher,and E. Wimmer) NatureBiotech- nology 20 (2010) 723-726. (24) Optimizing Restriction Site Placement for Synthetic Genomes (with P.Montes, H. Memelli, C. Ward, J. Kim, and J. Mitchell), Information and Computation 213 (2012) 59-69. Also: Proc. 21st Combinatorial Pattern Matching (CPM 2010), NewYork, June 21-23, 2010. (25) Watch the Story Unfold with TextWheel: Visualization of Large-Scale News Streams (with W. Cui, H. Zhou, H. Qu, and W.Zhang) ACMTrans. on Information Systems and Technology 3(2): 20 (2012). Also, preliminary version in FirstACM Workshop on Intelligent Visual Inter- faces for TextAnalysis (IVITA’10), 5-8, Hong Kong, February 7, 2010. (26) Expanding Network Communities from Representative Examples (with A. Mehler). ACM Tr ans. KnowledgeDiscovery from Data (TKDD), 3(2) (2009). Special Issue on Social Com- puting, Behavioral Modeling, and Prediction. (27) Pattern Matching with Address Errors: Rearrangement Distances (with A. Amir,Y.Aumann, G. Benson, A. Levy,O.Lipsky, E.Porat, and U. Vishne). J. Computer and System Sciences 75 (2009) 359-370. Also 17th ACM-SIAM Symp. on Discrete Algorithms (SODA’06). 1221-1229. Miami FL, January 20-22, 2006. (28) Crystallizing Short-Read Assemblies Around Seeds (with S. Hossain and N. Azimi). BMC Bioinformatics 10 (2009) Suppl 1:S15. Selected papers from Seventh Asia Pacific Bioinfor- matics Conference (APBC 2009), Beijing CN. January 13-16, 2009. (29) Virus attenuation by genome-scale changes in codon-pair bias (with J.R. Coleman, D. Pa- pamichail, B. Futcher,S.Mueller,and E. Wimmer). Science 320 (2008) 1784-1787. (30) Algorithms for Deterministic Call Admission Control of pre-stored VBR Video streams (with C. Tryfonas, D. Papamichail and A. Mehler) Journal of Multimedia Systems 4(2009) 161-181. (31) Concordance-Based Entity-Oriented Search (with M. Bautin) WebIntelligence and Agent Sys- tems: An International Journal 7(2009) 303-319. Special issue from IEEE/ACM Web Intelli- gence (WI-07),Silicon ValleyCA, November 2-5, 2007, 586-592. (32) Analysis of Airplane Boarding Times (with E. Bachmat, D. Berend, L. Sapir,and N. Stol- yarov). Operations Research 57 (2009) 499-513. (33) Combinatorial Dominance Guarantees for Problems with Infeasible Solutions, (with D. Berend and Y.Twitto). ACMTransactions on Algorithms 5 (2008). Also International Con- ference on Analysis of Algorithms (AofA07) Juan-les-pins, France. June 17-22, 2007. (34) Elevated CO2 Affects Soil Microbial Diversity Associated with Trembling Aspen (with C. Lesaulnier,D.Papamichail, S. McCorkle, B. Ollivier,S.Taghavi, D. Zak, and D. van der Lelie) Environmental Microbiology 10 (2008) 926-941. (35) Optimal boarding policies for thin passengers (with E. Bachmat, D. Berend, and L. Sapir). Advances in Applied Probability 39 (2007) 1098-1114. 5

(36) ImprovedBounds on Sorting with Length-Weighted Reversals (with M. Bender,D.Ge, S. He, H. Hu, R. Pinter and F.Swidan) J. Computer and System Sciences 74 (2007) 744-774. Also 15th ACM-SIAM Symp. on Discrete Algorithms (SODA’04). NewOrleans LA, January 2004. (37) Restricting SBH Ambiguity via Restriction Enzymes (with S. Snir), Discrete Applied Mathe- matics 155 (2007) 857-867. Special issue on computational biology.Also Second Workshop on Algorithms in Bioinformatics (WABI 2002) Springer-Verlag LNCS 2452, 404-417, 2002. (38) Reduction of the rate of poliovirus protein synthesis through large scale codon deoptimization causes virus attenuation of viral virulence by lowering specific infectivity (with S. Mueller,D. Papamichail, J.R. Coleman, and E. Wimmer). J. ofVirology 80 (2006) 9687-96. (39) Spatial Analysis of News Sources (with A. Mehler,Y.Bao, X. Li, and Y.Wang). IEEE Trans. Visualization and Computer Graphics 12 (2006) 765-772. Special issue on 12th IEEE Symp. on Information Visualization (InfoVis 2006). Baltimore MD, Oct. 29 - November 3, 2006. (40) Meta-analysis based on control of False Discovery Rate: Combining yeast ChiP-chip data sets (with S. Pyne and B. Futcher). Bioinformatics. 20 (2006) 2516-22 (41) Analysis of Airplane Boarding via Space-time Geometry and Random Matrix Theory (with E. Bachmat, D. Berend, L. Sapir,and N. Stolyarov). Journal of Physics A: Mathematical and General 39 (2006) L453-L459. (42) Two Proteins for the Price of One: The Design of Maximally Compressed Coding Sequences (with B. Wang, D. Papamichail, and S. Mueller). Natural Computing 6 (2007) 359-370. Also in 11th International Meeting on DNAComputing,June 6-9, 2005, Lecture Notes in Computer Science, 3892 (2006) 387-398. London, Ontario, Canada (43) The Cell Cycle Regulated Genes of Schizosaccharomyces pombe (with A. Oliva,A.Rose- brock, F.Ferrezuelo, S. Pyne, B. Futcher,and J. Leatherwood) PLoS Biology (2005) 3(7):e225. (44) Some Lower Bounds on Geometric Separability Problems, (with E. Arkin, F.Hurtado, J. Mitchell, and C. Seara). Int. J.Comp. Geometry and Applications (2006) 16(1) 1-26. (45) CopyCorrection and Concerted Evolution in the Conservation of Yeast Genes (with S. Pyne and B. Futcher). Genetics (2005) 170 1501-1513. (46) Least Common Ancestors on Directed Acyclic Graphs, (with M. Bender,G.Pemmasani, and P. Sumazin), J. ofAlgorithms 57 (2005) 75-94. Also Twelfth ACM-SIAM Symp. on Discrete Algorithms (SODA2001), January 7-9, 2001, pp. 845-854. (47) Integrating Microarray Data by Consensus Clustering (with V.Filkov) International Journal of Artificial Intelligence Tools,13(2004) 863-880. Special issue on 15th IEEE Int. Conf. Tools with Artificial Intelligence (ICTAI ’03),418-426, Sacramento CA, November 3-5, 2003. (48) Efficient Data Structures for Maintaining Set Partitions (with M. Bender and S. Sethia). Ran- dom Structures and Algorithms 25 (2004) 43-67. Also, Seventh Scandinavian Workshop on Algorithm Theory (SWAT ’00),July 2000. (49) Algorithms for Testing that DNAWord Designs Avoid Unwanted Secondary Structure (with D. Dees, L. Slaybaugh, Y.Zhao, A. Condon, and B. Cohen). Natural Computing 2(2003) 391-415, special issue on Eighth International meeting on DNABased Computers(DNA8), Hokkaido University,Japan, June 10-13, 2002. (50) When Can You Fold a Map? (with E. Arkin, M. Bender,E.Demaine, M. Demaine, J. Mitchell and S. Sethia), Computational Geometry: Theory and Applications 29 (2004) 23-46. Also Proc. 7th Workshop on Algorithms and Data Structures (WADS),Providence, RI, USA August 8-10, 2001. Springer-Verlag LNCS 2125, 401-413. (51) Visualizing Objects with Mirrors (with F.Hurtado, M. Noy, J.-M. Robert, and V.Sacristan). Computer Graphics Forum 23 (2004) 157-166. Translated into Spanish as ‘‘Visualizacio’n de Objectos Mediante Espejos’’for CEIG’96, Valencia, Spain, June 1996. 6

(52) Deconvolving Sequence Variation in Mixed DNAPopulations, (with A. Wildenbergand P. Sumazin). J. Computational Biology 10 (2003) 635-652. Also, Sixth International Conf.on Computational Molecular Biology (RECOMB 02),Washington DC, April 18-21, 2002. (53) Natural selection and algorithmic design of mRNA. (with B. Cohen). J. Computational Biol- ogy 10 (2003) 419-432. Also Sixth International Conf.onComputational Molecular Biology (RECOMB 02),Washington DC, April 18-21, 2002. (54) Microarray Synthesis through Multiple-Use PCR Primer Design (with R. Fernandes). Bioin- formatics 18 (2002) S128-S135. Special issue on Tenth International Conf.onIntelligent Sys- tems for Molecular Biology (ISMB 2002).Edmonton, Alberta, August 3-7, 2002. (55) The Lazy Bureaucrat Scheduling Problem (with E. Arkin, M. Bender,and J. Mitchell). Infor- mation and Computation 184 (2003) 129-146. Also, Workshop on Algorithms and Data Structures (WADS ’99), Vancouver,B.C. August 1999. (56) Designing Better Phages, Bioinformatics 17 (2001) S253-261. Also Ninth International Conf. on Intelligent Systems for Molecular Biology (ISMB 2001),Copenhagen, Denmark, July 21-25, 2001. (57) Dealing with Errors in Interactive Sequencing by Hybridization (with V.Phan). Bioinformat- ics 17 (2001) 862-870. Special issue of ThirdGeorgia Tech-Emory International Conference on Bioinformatics,Atlanta, Georgia, November 15-18, 2001. (58) Analysis Techniques for Microarray Time-Series Data (with V.Filkov and J. Zhi). Journal of Computational Biology 9(2002) 317-330, special issue Fifth International Conf.onComputa- tional Molecular Biology (RECOMB 01),Montreal, Quebec, April 21-24, 2001. (59) Shift Error Detection in Standardized Exams (with P.Sumazin). J. Discrete Algorithms,2, (2004) 313-331. Also, 11th Symp. on Combinatorical Pattern Matching (CPM ’00). Springer-Verlag LNCS 1848, 264-276. Montreal, CA, June 2000. (60) Identifying Gene Regulatory Networks from Experimental Data (with T.Chen and V.Filkov) Parallel Computing,special issue on NewTrends in High Performance Computing, 27 (2001) 141-162. Also, Proc. RECOMB ’99,Lyon France, April 1999, 94-103. (61) Optimizing Combinatorial Library Construction via Split Synthesis (with B. Cohen) Journal of Combinatorial Chemistry,2(2000) 10-18. Also, Proc. RECOMB ’99,Lyon France, April 1999, 124-133. (62) A Case Study in Genome-LevelFragment Assembly (with T.Chen). Bioinformatics,16 (2000) 494-500. (63) Efficiently Computing and Updating Triangle Strips (with J. El-Sana, F.Evans, A. Varshney, S. Skiena, and E. Azanli). Computer-Aided Design,special issue on Multiresolution Geomet- ric Models., 32 (2000) 753-772. (64) Matching for Run-Length Encoded Strings (with A. Apostolico and G. Landau). Journal of Complexity,15(1999) 4-16. Special issue for papers from Sequences ’97,Positano Italy,June 11-13, 1997. (65) Graph Drawing and Manipulation with LINK, (with J. Berry,N.Dean, M. Goldberg, and G. Shannon). SoftwarePractice and Experience,special issue on algorithm engineering, 30 (2000) 1285-1302. Also Graph Drawing ’97,Springer-Verlag LNCS, Rome, September 1997. (66) Filling aPennyAlbum (with Shiyong Lu). Chance,13-2 (Spring 2000) 25-28. (67) On the Maximum Scatter TSP (with E. Arkin, Y.-J. Chiang, J. Mitchell, and T.Yang). SIAM J. Computing.29(1999) 515-544. Also, Proc. Eighth ACM-SIAM Symp. on Discrete Algo- rithms (SODA) pp. 211-220, 1997. 7

(68) Local Rules for Protein Folding on a Triangular Lattice and Generalized Hydrophobicity in the HP Model (with R. Agarwala, S. Batzoglou, V.Dancik, S. Decatur,M.Farach, S. Hannen- halli, and S. Muthukrishnan). J. Computational Biology,RECOMB Special Issue, 4 (1997) 275-296. Also Proc. Eighth ACM-SIAM Symp. on Discrete Algorithms (SODA) pp. 390-399 and Proc. FirstInternational Conf.onComputational Molecular Biology (RECOMB 97). (69) On Minimum-Area Hulls (with E. Arkin, Y.J. Chiang, M. Held, J. Mitchell, V.Sacristan, and T. Yang). Algorithmica 21 (1998) pp. 119-136. Also, Fourth European Symposium on Algo- rithms (ESA ’96), Lecture Notes in Computer Science, v.1136, 334-348, 1996. (70) Hamiltonian Triangulations for Fast Rendering (with E. Arkin, M. Held, and J. Mitchell). The Visual Computer 12 (1996) pp. 429-444. Also, Second European Symposium on Algorithms, Springer-Verlag Lecture Notes in Computer Science 855 36-47. (71) Principles and Practice of Unification Factoring, (with S. Dawson, C.R. Ramakrishnan, and T. Swift), ACMTrans. on Programming Languages(TOPLAS), 18 (1996) 528-563. Conference version: Unification Factoring for Efficient Execution of Logic Programs (with S. Dawson, C.R. Ramakrishnan, I.V.Ramakrishnan, K. Sagonas, T.Swift, and D.S. Warren). 22nd ACM Symposium on Principles of Programming Languages(POPL ’95),pp. 247-258, San Francis- co CA, January 23-25, 1995. (72) Sorting with Fixed-Length Reversals (with T.Chen). Discrete Applied Mathematics,special issue on Computational Biology, 71 (1996) 269-295. Also, report 95-5, Department of Com- puter Science, State University of NewYork, StonyBrook, May 1995 and DIMACS Report 95-18, Rutgers University. (73) Dialing for Documents: an Experiment in Information Theory (with Harald Rau). Journal of Visual Languagesand Computing 7 (1996) 79-95. Also, Seventh ACM SIGGRAPH Sympo- sium on User Interface Softwareand Technology (UIST ’94),Marina Del Rey, California, November 2-4, 1994, 147-155. (74) Positional Sequencing by Hybridization (with S. Hannenhalli, W.Feldman, H. Lewis, and P. Pevzner). Computer Applications in the Biological Sciences (CABIOS) 12 (1996) 19-24. Al- so, DIMACS Report 95-19, Rutgers University,June 1995. (75) Reconstructing Strings from Substrings, (with Gopalkrishnan Sundaram), J. Computational Biology 2 (1995) 333-353. Also Workshop on Data Structures and Algorithms (WADS ’93), Springer-Verlag Lecture Notes in Computer Science 709 565-576 and report 93-10, Depart- ment of Computer Science, State University of NewYork, StonyBrook, June 1993. (76) Recognizing Polygonal Parts from Width Measurements (with E. Arkin, M. Held, and J. Mitchell). Computational Geometry: Theory and Applications 9(1998) 237-246. Also, Sev- enth Canadian Conf.Computational Geometry 199-204, Quebec CA (1995). (77) Decision Trees for Geometric Objects (with E. Arkin, H. Meijer,J.Mitchell, and D. Rappa- port). Int. J.Computational Geometry and Applications 8(1998) 343-363. Also, Proc. Ninth ACMSymposium on Computational Geometry,369-378, San Diego, CA, May 1993. (78) Reconstructing Polygons from X-Rays, (with Henk Meijer), Geometriae Dedicata 61 (1996) 191-204. Also Fifth Canadian Conference on Computational Geometry,381-386, Waterloo, Ontario CA, 1993. (79) Recognizing Small Subgraphs. (with Gopalkrishnan Sundaram). Networks 25 (1995) 183-191. Also, Report 92-16, Department of Computer Science, State University of New York, StonyBrook, September 1992. (80) A Partial Digest Approach to Restriction Site Mapping, (with Gopalkrishnan Sundaram), Bul- letin of Mathematical Biology 56 (1994) 275-294. Also, Proc. First Int. Conf.Intelligent Sys- tems for Molecular Biology,Washington DC, pp. 362-370, AAAI/MIT Press, Menlo Park CA, 1993 and report 92-15, Department of Computer Science, State University of NewYork, StonyBrook, August 1992. 8

(81) Algorithms for Square Roots of Graphs (with Yaw-Ling Lin). SIAM J.Discrete Mathematics 8 (1995) 99-118. Also, Second Annual International Symposium on Algorithms,Taipei, Tai- wan, December 16-18, 1991. Springer-Verlag Lecture Notes in Computer Science 557 12-21 and report 91-11, Department of Computer Science, State University of NewYork, Stony Brook, June 1991. (82) Complexity Aspects of Visibility Graphs, (with Yaw-Ling Lin), Int. J.Computational Geome- try and Applications 5 (1995) 289-312. Report 92-08, Department of Computer Science, State University of NewYork, StonyBrook, May 1992. (83) Model-Based Probing Strategies for Convex Polygons (with Eugene Joseph). Computational Geometry: Theory and Applications 2 (1992) 209-221. Also, report 90-35, Department of Computer Science, State University of NewYork, StonyBrook, October 1990. (84) An Anomaly Concerning Ties in Lotto-likeGames (with Gopalakrishnan Sundaram). Applied Mathematics Letters 6-2 (1993) 55-59. Also, Report 91-16, Department of Computer Science, State University of NewYork, StonyBrook, August 1991. (85) Gracefully Labeling Prisms (with Jen-Hsin Huang). ARS Combinatoria 38 (1994) 225-242. Also, report 90-33, Department of Computer Science, State University of NewYork, Stony Brook, October 1990. (86) Precise Rank-Ordering of all Paths through an Interconnection Network (with Yaw-Fu Jan and Armen Zemanian). IEEE Trans. Circuits and Systems 39 (1992) 1011-1014. Also, 1991 IEEE International Symposium on Circuits and Systems,Singapore, June 11-14, 1991. (87) Interactive Reconstruction via Probing, (invited paper) Proceedings of the IEEE, 80 (1992) 1364-1383. Also, report 90-20, Department of Computer Science, State University of New York, StonyBrook, June 1990. (88) Probing Convex Polygons with Half-Planes. J. Algorithms 12 (1991) 359-374. Also, report UIUCDCS-R-87-1380, Department of Computer Science, Urbana IL. October 1987. (89) Tight Bounds on a Problem of Lines and Intersections (with Micha Sharir). Discrete Mathe- matics 89 (1991) 313-314. (90) Counting k-Projections in a Point Set. J. Combinatorial Theory,Series A 55 (1990) 153-160. (91) Problems in Geometric Probing. Algorithmica 4 (1989) 599-605. (92) Reconstructing Graphs from Cut-set Sizes. Information Processing Letters 32 (1989) 123-127. (93) Searching on a Tape (with Scott Hornick, SanjeevMaddila, Ernst Mucke, Harald Rosenberg- er,and Ioannis Tollis). IEEE Trans. Computers 39 (1990) 1265-1272. Also, report ACT-89, Coordinated Science Laboratory,Urbana IL, February 1988. (94) Encroaching Lists as a Measure of Presortedness. BIT, 28 (1988) 775-784. (95) A Fairer Scoring System for Jai-alai. Interfaces 18-6 (November/December 1988) 35-41. (96) Probing Convex Polygons with X-rays (with Herbert Edelsbrunner). SIAM J.Computing 17 (1988) 870-882. Also, report UIUCDCS-R-86-1306, Department of Computer Science, Ur- bana IL, November 1986. (97) Eight Pieces Cannot CoveraChess Board (with Arch D. Robison and Brian J. Hafner). Com- puter Journal 32 (1989) 567-570. (98) Tablet: Personal Computer of the Year 2000 (with Bartlett Mel, Stephen Omohundro, Arch Robison, Kurt Thearling, , and LukeYoung). Communications of the ACM, 31,(June 1988) 638-646, with excerpts in Omni 10-9 (June 1988) 118,126 and EOS (Dutch translation) 7-1 (January 1990) 72-73. Also, report CSG-85, Coordinated Science Laboratory, Urbana IL and UIUCDCS-R-88-1406, Department of Computer Science, Urbana IL February 1988. 9

(99) On the Number of Furthest Neighbour Pairs in a Point Set (with Herbert Edelsbrunner). Amer. Math. Monthly 96 (1989) 614-618. Also, report UIUCDCS-R-86-1312, Department of Com- puter Science, Urbana IL, December 1986. (100) Further Evidence for Randomness in π. ComplexSystems 1 (1987), 361-366.

Publications in Conference Proceedings

(101) Chapter Captor: TextSegmentation in Novels (C. Pethe, A. Kim, and S. Skiena) Empirical Methods in Natural Language Processing (EMNLP 2020), Punta Cana, Dominican Republic, November 16-20, 2020. (102) What Time is It? Temporal Analysis of Novels (A. Kim, C. Pethe, and S. Skiena) Empirical Methods in Natural Language Processing (EMNLP 2020), Punta Cana, Dominican Republic, November 16-20, 2020. (103) Fast Spatial Autocorrelation (with A. Amgalan and L. Mujica-Parodi) 20th IEEE Int. Conf.on Data Mining (ICDM 2020), Sorento IT,November 17-20, 2020. (104) Online AUCOptimization for High-dimensional Sparse Datasets (with B. Zhou and Y.Ying) 20th IEEE Int. Conf.onData Mining (ICDM 2020), Sorento IT,November 17-20, 2020. (105) Marking Streets to Improve Parking Density (with C. Xu) Urban ComplexSystems 2020, ComplexSystem Society,December 4-11, 2020. ArXiv, http://arxiv.org/abs/1503.09057, March 27, 2015. (106) ImprovedMapReduce Load Balancing through Distribution-Dependant Hash Function Opti- mization (with Z. Ahmad, S. Duppala, and R. Chowdhury) IEEE Int. Conf.onParallel and Distributed Computing (ICPADS 2020), Hong Kong, December 2-4, 2020. (107) The Trumpiest Trump? Identifying aSubject’sMost Characteristic Tweets (with C. Pethe) mpirical Methods in Natural Language Processing (EMNLP 2019), Hong Kong, November 3-7, 2019. (108) Fast and Accurate Network Embeddings via Sparse Random Projection (with H. Chen, S. Fa- had Sultan, M. Chen, and Y.Tian) 28th ACM Conf.Information and KnowledgeManagement (CIKM 2019), Beijing China, November 3-7, 2019. (109) Learning to Represent Bilingual Dictionaries (with M. Chen, Y.Tian, H. Chen, K.-W.Chang, and C. Zaniolo) Conf.onComputational Natural Language Learning (CoNLL 2019), Hong Kong, November 3-4, 2019. (110) Pre-Phaser: Precise Cell-Cycle Phase Detector for scRNA-seq (with A. Yurovskyand B. Futcher) 10th ACM Conf.Bioinformatics, Computational Biology,and Health Informatics (ACM-BCB 2019). Niagra Falls NY,September 7-10, 2019. (111) MediaRank: Computational Ranking of Online News Sources (with J. Ye). 25th ACM SIGKDD Conf.KnowledgeDiscovery and Data Mining (KDD 2019), Anchorage AK, August 2019. (112) The Secret LivesofNames: Name Embeddings from Social Media (with J. Ye) 25th ACM SIGKDD Conf.KnowledgeDiscovery and Data Mining (KDD 2019), Anchorage AK, August 2019. (113) Data Races and the Discrete Resource-Time TradeoffProblem with Resource Reuse over Paths (with R. Das, S. Tsai, S. Duppala, J. Lynch, E. Arkin, R. Chowdhury,and J. Mitchell) 31st ACM Symp. on Parallelism in Algorithms and Architectures (SPAA 2019). Phoenix AZ. June 22-24, 2019. (114) Social Relation Inference by Label Propagation (with H. Chen, Y.Tian, X. Sun, M. Chen, and B. Perozzi) 41st European Conf.onInformation Retrieval (ECIR 2019). Cologne, Germany, April 14-18, 2019. Springer-Verlag LNCS 11437. Preliminary version: 4th Int. Workshop on 10

Machine Learning on Graphs (MLG 2018). London, August 19-23, 2018. (115) Multi-viewModels for Political Ideology Detection of News Articles (with V.Kulkarni, J. Ye, and W.Wang) Empirical Methods in Natural Language Processing (EMNLP 2018), Brussels, Belgium. October 31-November 4, 2018. (116) Enhanced Network Embeddings via Exploiting Edge Labels (with H. Chen, Y.Tian, M. Chen, X. Sun, and B. Perozzi) 27th ACM Conf.Information and KnowledgeManagement (CIKM 2018), Turin, Italy,October 22-26, 2018. Full version at http://arxiv.org/abs/1809.05124. (117) DeepAnnotator: Genome Annotation With Deep Learning (with R. Amin, A. Yurovsky, and Y. Tian). 9th ACM Conf.Bioinformatics, Computational Biology,and Health Informatics (ACM-BCB 2018). Washington DC, August 29-September 1, 2018. (118) Simple Neologism Based Domain Independent Models to Predict Year of Authorship of Doc- uments (with V.Kulkarni, Y.Tian, and P.Dandiwala). 27th International Conference on Computational Linguistics (COLING 2018) Santa Fe NM, August 20-26, 2018. Also South- ern California Natural Language Symposium (SoCal 2018). Irvine CA, April 6, 2018. (119) Co-training Embeddings of Multilingual Knowledge Graphs and Entity Descriptions for Se- mi-supervised Knowledge Alignment (with M. Chen, Y.Tian, C. Zaniolo, and K. Chang) 27th International Joint Conference on Artificial Intelligence and the 23rdEuropean Conference on Artificial Intelligence (IJCAI-ECAI 2018), Stockholm, Sweden, July 13-19, 2018 (120) Syntax-Directed Variational Autoencoder for Structured Data (with H. Dai, Y.Tian, B. Dai, and L. Song). 6th Int. Conf.onLearning Representations (ICLR 2018), Vancouver BC Cana- da, April 30-May 3, 2018. Also, arXiv:1802.08786. (121) HARP: Hierarchical Representation Learning for Networks (with H. Chen, B. Perozzi, and Y. Hu). Thirty-Second AAAI Conference on Artificial Intelligence (AAAI 2018), NewOrleans LA. February 2-7, 2018. Preliminary version: 3th Int. Workshop on Machine Learning on Graphs (MLG 2017). Halifax, Nova Scotia, August 13-17 2017. (122) Syntax-Directed Variational Autoencoder for Molecule Generation (with H. Dai, Y.Tian, B. Dai, and L. Song). Workshop on Machine Learning for Molecules and Materials (best paper aw ard), 31st Conf.onNeural Information Processing Systems (NIPS 2017). Long Beach CA, December 4-9, 2017. (123) Nationality Classification using Name Embeddings (with J. Ye, S. Han, Y.Hu, B. Coskun, and M. Liu). 26th ACM Conf.Information and KnowledgeManagement (CIKM 2017), Singa- pore, November 6-10, 2017. Also, presented at Yahoo TechPulse 2016. (124) Optimal Codon Pair Bias Design (with N. Donoghue, J. Gardin, and B. Futcher) IEEE Int. Conf.Bioinformatics and Biomedicine (BIBM 2017), Kansas City MO, November 13-16, 2017. (125) Generating Look-alikeNames for Security Challenges (with S. Han, Y.Hu, S. Skiena, B. Coskun, M. Liu, and J. Perez). 10th ACM Workshop on Artifical Intelligence and Security (AISec 2017), Dallas TX, November 3, 2017. (126) Walklets: Skip-a-Step for Multiscale Graph Embeddings (with B. Perozzi, V.Kulkarni, and H. Chen). IEEE/ACM Conf.Advances in Social Networks Analysis and Mining (ASONAM 2017).Sydney, Australia, July 31-August 3, 2017. (127) NanoBLASTer: Fast Alignment and Characterization of Oxford Nanopore Single Molecule Sequencing Reads (with R. Amin and M. Schatz) 6th IEEE International Conference on Com- putational Advances in Bio and Medical Sciences (ICCABS).Atlanta GA, October 13-15, 2016. (128) Freshman or Fresher? Quantifying Geographic Variation of Internet Language (with V.Kulka- rni and B. Perozzi) 10th Int. AAAI Conference on Web and Social Media (ICWSM 2016) Cologne, Germany, May 17-20, 2016. 11

(129) Optimizing Read Reversal for Sequence Compression (with Z. Sichen, L. Zhao, Y.Liang, M. Zamani, R. Patro, R. Chowdhury,E.Arkin, and J. Mitchell). Workshop on Algorithms in Bioinformatics (WABI 2015). Atlanta GA, Sept. 10-12, 2015. (130) Statistically Significant Detection of Linguistic Change (with V.Kulkarni, R. al-Rfou, and B. Perozzi). 24th Int. World Wide Web Conference (WWW 2015) Florence Italy,May 18-22, 2015. (131) Polyglot-NER: Massively Multilingual Named Entity Recognition (with R. Al-Rfou, V. Kulkarni, and B. Perozzi). SIAM Int. Conf. on Data Mining (SDM 2015). Vancouver,CA April 30-May 2, 2015. (132) DeepWalk: Online Learning of Social Representations (with B. Perozzi and R. al-Rfou). 20th ACMSIGKDD Conf.KnowledgeDiscovery and Data Mining (KDD 2014), NewYork NY, August 2014. (133) Constructing Sentiment Lexicons for All Major Languages (with Y.Chen). 52nd Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore MD, June 22-27 2014. (134) Inducing Language Networks from Continuous Space Word Representations (with B. Perozzi, R. Al-Rfou, and V.Kulkarni) 5th Workshop on ComplexNetworks (CompleNet 2014). Bologna Italy,March 12-14, 2014. (135) Designing Autocorrelated Genes (with R. Yeasmin, J. Tithi, and J. Chen). Fourth ACM Int. Conf.onBioinformatics, Computational Biology,and Biomedical Informatics (ACM BCB 2013), Washington DC, Sept. 22-25, 2013. (136) Polyglot: Distributed Word Representations for Multilingual NLP (with R. Al-Rfou and B. Perozzi), 17th Conf.Computational Natural Language Learning,Sofia, Bulgaria August 8-9, 2013. (CoNLL 2013). (137) The Expressive Power of Word Embeddings (with Y.Chen, B. Perozzi, and R. Al-Rfou). Int. Conf.onMachine Learning (ICML 2013) Workshop on Deep Learning for Audio, Speech, and Language Processing. Atlanta GA, June 16, 2013. (138) SpeedRead: AFast Named Entity Recognition Pipeline (with R. al-Rfou). 24th Int. Conf.on Computational Linguistics (COLING 2012), Mumbai India, December 8-15, 2012. (139) Designing RNASecondary Structures in Coding Regions. (with Rukhsana Yeasmin). LNBI 7292 299-314. Int. Symp. Bioinformatic Research and Applications (ISBRA 2012). Dallas Te xas, May 21-23, 2012. (140) Constructing Orthogonal de Bruijn Sequences (with Y.l-Lin, C. Ward, and B. Jain). Workshop on Algorithms and Data Structures (WADS 2011). Brooklyn NY,August 15-17, 2011. (141) Trading Strategies to Exploit News Sentiment (with W.Zhang). Int. Conf Weblogs and Social Media (ICWSM 2010),Washington DC. (142) The Wisdom of Bookies? Sentiment Analysis against the NFL Point Spread (with F.Hong). Int. Conf Weblogs and Social Media (ICWSM 2010),Washington DC. (143) Access: News and Blog Analysis for the Social Sciences (with M. Bautin, C. Ward, and A. Patil) 19th Int. World Wide Web Conference (WWW 2010), Raleigh NC, April 26-30, 2010. (144) Name-Ethnicity Classification from Open Sources (with A. Ambekar,C.Ward, J. Mo- hammed, and S. Male) 15th ACM SIGKDD Conf.KnowledgeDiscovery and Data Mining (KDD 2009), Paris France, June 28-July 1, 2009. (145) Improving Movie Gross Prediction Through News Analysis (with W.Zhang) IEEE/ACM Int. Conf.Web Intelligence and Intelligent Agent Technology (WI 2009), Milan Italy,September 15-18, 2009. Also, abstract in 29th Int. Symposium on Forecasting,Hong Kong, June 21-24, 2009. 12

(146) Identifying Differences in News Coverage Between Cultural/Ethnic Groups (with C. Ward and M. Bautin) News Analysis Workshop of IEEE/ACM Int. Conf.Web Intelligence and Intel- ligent Agent Technology (WI 2009), Milan Italy,September 15-18, 2009. (147) The Embroidery Problem (with G. Sabhnani, I. Kostisyna, J. Mitchell, and E. Arkin) Twenti- eth Canadian Conference on Computational Geometry (CCCG ’08) Montreal CA, August 11-13, 2008. (148) International Sentiment Analysis for News and Blogs (with M. Bautin and L. Vijayarenu). Second Int. Conf.onWeblogs and Social Media (ICWSM 2008), Seattle WA, March 26-28, 2008. (149) Large-Scale Sentiment Analysis for News and Blogs (with N. Godbole and M. Srinivasaiah). Int. Conf.onWeblogs and Social Media (ICWSM 2007), DenverCO, March 26-28, 2007. (150) Improving Usability Through Password-Corrective Hashing (with A. Mehler). 13th Symp. of String Processing and Information Retrieval (SPIRE ’06), Glasgow, Scotland. October 11-13, 2006. (151) Identifying Co-referential Names Across Large Corpra (with L. Lloyd and A. Mehler). Proc. Combinatorial Pattern Matching (CPM 2006).(invited paper) Springer-Verlag LNCS 4009 (2006) 12-23. (152) Newspapers vs. Blogs: Who Gets the Scoop? (with L. Lloyd and P.Kaulgud). Computational Approaches to Analyzing Weblogs (AAAI-CAAW2006), AAAI Technical Report SS-06-03 117-124. Stanford University,March 27-29, 2006. (153) Question Answering with Lydia (with J. Kil and L. Lloyd). 14th TextREtrieval Conference (TREC 2005), NIST GaithersburgMD, November 15-18, 2005. (154) Lydia: A System for Large-Scale News Analysis (with L. Lloyd and D. Kechagias). 12th Symp. of String Processing and Information Retrieval (SPIRE ’05), Buenos Aires, Argentina, November 2-4, 2005. Lecture Notes in Computer Science, 3772 (2005) 161-166. Also VJor- nadas de Matema’tica Discreta y Algori’tmica (in Spanish translation) (155) Airplane Boarding, Disk Scheduling, and Space-Time Geometry (with E. Bachmat, D. Berend, and L. Sapir). irst International Conference on Algorithmic Applications in Manage- ment, Xi’an, Shaanxi, China June 22-24, 2005. Lecture Notes in Computer Science, 3521 (2005) 192-202. (156) Bacterial Population Assay via k-mer Analysis (with D. Papamichail, S. McCorkle, and D. vander Lelie) ThirdAsia-Pacific Bioinformatics Conference (APBC 2005) Singapore, January 17-21, 2005. Series on Advances in Bioinformatics and Computational Biology 1 (2005) 299-308. (157) Attention and Communication: Decision Scenarios for Teleoperating Robots (with J. Nicker- son) Proc. 38th IEEE Hawaii Int. Conf.System Sciences,Big Island Hawaii, January 3-6, 2005. (158) An ImprovedTime-Sensitive Metaheuristic Optimizer (with V.Phan). 3rdInternational Workshop on Experimental and Efficient Algorithms,(WEA 2004), Buzios, Brazil, May 25-28, 2004. (159) Heterogeneous Data Integration with the Consensus Clustering Formalism (with V.Filkov) nternational Workshop on Data Integration in the Life Sciences (DILS 2004), Leipzig, Ger- many, March 25-26, 2004. Lecture Notes in Computer Science, 2994 (2004) 110-123. (160) Parsing without a Grammar: Making Sense of Unknown File Formats, (with L. Lloyd). Third IEEE International Conference on Data Mining (ICDM’03) Melbourne FL, November 19-22, 2003. (161) A Model for Analyzing Black-Box Optimization, (with V.Phan and P.Sumazin). Workshop on Algorithms and Data Structures (WADS ’03). Springer-Verlag LNCS 2748 (2003) 13

424-438. Ottawa,Canada, July 30-August 1, 2003. (162) Sorting with Length-Weighted Reversals (with R. Pinter), Proc. 13th International Conference on Genome Informatics (GIW 2002), Universal Academic Press, 173-182, 2002. (163) A Time-Sensitive System for Black-Box Combinatorial Optimization, (with V.Phan and P. Sumazin). Workshop on Algorithm Engineering and Experimenation (ALENEX ’02). San Francisco CA, January 2002. (164) Methods for Analyzing Gene Expression Data overMultiple Data Sets, (with V.Filkov and J. Zhi). 2001 IEEE - EURASIP Workshop on Nonlinear Signal and Image Processing,Balti- more, MD, June 3-6, 2001. (165) Who is interested in algorithms and why?: lessons from the StonyBrook Algorithms Reposi- tory.Second Workshop on Algorithm Engineering,Saarbrucken, Germany, August 20-22, 1998. Also, ACMSIGACT News (September 1999) 65-74. (166) Probe Trees for Touching Character Recognition (with G. Sazaklis, E. Arkin, and J. Mitchell) Proc. International Conference on Imaging Science,Systems and Technology,(CISST) Las Ve gas, NV July 6-9, 1998, pp. 282-289. (167) Trie-Based Data Structures for Sequence Assembly (with T.Chen). Proc. Combinatorical Pattern Matching (CPM ’97).Springer-Verlag LNCS v.1264, pp. 206-223. Also, accidently accepted (but not submitted) to WADS ’99. (168) Efficiently Partitioning Arrays (with S. Khanna and M. Muthukrishnan). Proc. ICALP ’97, Springer-Verlag LNCS 1256, pp. 616--626, 1997. (169) Fabricating Arrays of Strings (with R. Bradley). FirstInternational Conf.onComputational Molecular Biology (RECOMB 97) pp. 57-66, January 20-23, 1997. (170) Optimizing Triangle Strips for Fast Rendering (with F.Evans and A. Varshney). Proc. IEEE Visualization ’96,319-326, San Francisco CA, October 1996. (171) Reconstructing Strings from Substrings in Rounds (with D. Margaritis). Proc. 36th IEEE Symp. Foundations of Computer Science (FOCS ’95), pp. 613-620, Milwaukee WI, October 23-25, 1995. (172) Point Probe Decision Trees for Geometric Concept Classes, (with E. Arkin, M. Goodrich, J. Mitchell, D. Mount, and C. Piatko), Workshop on Data Structures and Algorithms (WADS ’93), Springer-Verlag Lecture Notes in Computer Science 709 95-105. (173) Ranger: ATool for Nearest Neighbor Search in High Dimensions (with M. Murphy). Video reviewin Proc. Ninth ACM Symposium on Computational Geometry,403-404, San Diego, CA, May 1993. (174) Frequencies of Large Distances in Integer Lattices (with Venugopal Reddy). Proc. Seventh International Conference on Graph Theory,Combinatorics, Algorithms, and Applications, Kalamazoo MI, in Graph Theory,Combinatorics, and Applications,ed. Y.Alavi and A. Schwenk, John Wileyand Sons, NewYork, v.2,981-989. Also, Report 89-18, Department of Computer Science, State University of NewYork, StonyBrook, June 1989. (175) Inducing Codes from Examples (with Wai-Hong Leung). FirstIEEE Data Compression Con- ference,267-276, April 8-10, 1991, Snowbird Utah. Also, report 90-32, Department of Com- puter Science, State University of NewYork, StonyBrook, October 1990. (176) An ImprovedDeferred Data Structure for Membership Queries (with Anil Bhansali). 28th Annual Allerton Conference on Communication, Control, and Computing,914-923, October 3, 1990. Also, report 89-24, Department of Computer Science, State University of NewYork, StonyBrook, October 1989. (177) Reconstructing Sets from Interpoint Distances (with Warren Smith and Paul Lemke). in Sixth ACMSymposium on Computational Geometry,June 1990, 332-339. Full version in Discrete 14

and Computational Geometry: The Goodman-PollackFestschrift (Algorithms and Combina- torics, 25) Springer-Verlag 2003. (178) Compiler Optimization by Detecting Recursive Subprograms, in “Proceedings ACM ’85”, 403-411.

Books

(179) The Data Science Design Manual,Springer International Publishing AG, Switzerland, 2017. Japanese translation published by O’Reilly Japan, 2020. Russian translation to be published by Dialektika Publishing. Korean translation to be published by Agency-One. (180) Who’sBigger: WhereHistorical Figures Really Rank (with C. Ward), Cambridge University Press, NewYork, 2013. (181) The Algorithm Design Manual (second edition), Springer Verlag, London, August 2008. IS- BN number 978-1-84800-069-8. Chinese translation and reprint edition to be published by Tsinghua University Press, Beijing. Korean translation to be published by SciTech Media, Inc. Japanese translation published by Springer Japan KK. Russian translation published by BHV Publishing House, Russia. First edition published by Telos/Springer-Verlag, NewYork, November 1997. ISBN number 0-387-94860-0. Alternate Selection of The Library of Sci- ence,June 1998. SISE reprint edition (for Asian markets), 2004. (182) Programming Challenges: The Programming Contest Training Manual (with M. Revilla), Springer-Verlag, NewYork, 2003. ISBN Number 0-387-00163-8. Korean translation by Han- bit Media, Inc., 2004. Polish translation by WydawnictwaSzkolne i Pedagogiczne of Warsaw, 2004. Russian translation by Kudits-Obraz Publishing Company, 2005. Spanish translation by Univ. deValladolid Press, 2006. Chinese translation by Tsinghua University Press, 2009. (183) Computational Discrete Mathematics (with S. Pemmaraju), Cambridge University Press, New York, 2003. (184) Calculated Bets: Computers, Gambling,and Mathematical Modeling to Win,Cambridge Uni- versity Press, NewYork, 2001. ISBM number 0-521-80426-4. Electronic edition available through www.netLibrary.com (185) Implementing Discrete Mathematics: Combinatorics and Graph Theory in Mathematica,Ad- vanced Book Division, Addison-Wesley, Redwood City CA, June 1990. ISBN number 0-201-50943-1. Japanese translation published by Toppan, Tokyo, July 1992.

Publications in Books

(186) Network Embeddings (with H. Chen, B. Perozzi, and R. al-Rfou) in Social Media Analytics: Adviances and Applications,edited by J. Tang and C. Aggarwal. CRC Press, 2018 (187) Dominance Certificates for Combinatorial Optimization Problems, (with D. Berend and Y. Twitto). in Optimization Problems in Graph Theory,dev oted to the 60th birthday of professor Gregory Gutin. edited by B. Goldengorin. Springer Optimization and its Applications, 2018. (188) Assembly ForDouble-Ended Short-Read Sequencing Technologies (with J. Chen). ‘‘Ad- vances in Genome Sequencing Technology and Algorithms’’edited by E. Mardis, S. Kim, and H. Tang, Artech House Publishers, 123-141, 2007. (189) : Synthesis and Modification of a Chemical Called Poliovirus (with S. Mueller,D.Papamichail, J. Coleman, J. Cello, A. Paul, and E. Wimmer) in FutureTrends in Microelectronics: The Nano, the Giga, the Ultra, and the Bio,WileyInterscience, 2007. (190) Meta-Analysis of Microarray Data (with S. Pyne and B. Futcher) in Bioinformatics Algo- rithms: Techniques and Applications,ed. I. Mandoiu and A. Zelikovsky. WileyBook Series 15

on Bioinformatics: Computational Techniques and Engineering, 2008. 329-352. (191) MultiPrimer: Asystem for Microarray PCR Primer Design (with R. Fernandes), Chapter 15 of PCR Primer Design,ed. Anton Yuryev. Methods in Molecular Biology Series, Humana Press, 2007. (192) Optimizing RNASecondary Structure overAll Possible Encodings of a GivenProtein (with B. Cohen). RECOMB 2000 Poster Session and Currents in Computational Biology,ed. S. Miyano, R. Shamir,and T.Takagi, Universal Academy Press, 2000. (193) Geometric Reconstruction Problems. CRC Handbook of Discrete and Computational Geome- try,ed. J. E. Goodman and J. O’Rourke, CRC Press, 481-490, 1997. Second edition: 665-676, 2004. Third edition: 2016 (with Y.Disser). (194) Analyzing Integer Sequences (with Anil Bhansali). Computational Support for Discrete Mathematics,DIMACS Series in Discrete Mathematics and Theoretical Computer Science, volume 15, pp. 1-16. American Mathematical Society,1994. (195) Teaching Combinatorics and Graph Theory with Mathematica. Symbolic Computation in Un- dergraduate Mathematics Education,ed. ZavenKarian, MAA Notes, volume 24, 1992, 143-153. Also in Jounradas sobreNuevas Tecnologias en la Ensenanza de las Matematicas en la Universidad (TEMU-95), ed. A. Montes and J. Brunat, U.P.C., Barcelona, February 16-18, 1995, 413-424. (196) Computational Geometry. The Encyclopedia of Computer Science and Engineering,ed. An- thonyRalston and Edwin D. Reilly,third edition, Von Nostrand Reinhold, NewYork, 1992, 214-216, and fourth edition, Nature Publishing Group, 2000, 265-268.

Submitted for Publication

(197) A Data-DrivenPredictome for Cognative,Psychiatric, Medical, and Lifestyle Factors on the Brain (with F.Sultan and L. Mujica-Parodi) Submitted to NatureNeuroscience. (198) Hierarchies overVector Spaces: Orienting Word and Graph Embeddings (X. Guo and S. Skiena) Submitted to North American Association for Computational Linguistics (NAACL 2020), (199) Does it Pay to Optimize AUC? (B. Zhou and S. Skiena) Submitted to Thirty-fourth Confer- ence on Neural Information Processing Systems (NeurIPS 2020), Vancouver BC, December 5-12, 2020. (200) Mechanism of Gene Attenuation by Poor Codon Pair Bias (with J. Gardin, S. Honey, A. Yurovsky, S.Mueller,E.Wimmer,and B. Futcher). Submitted to Current Biology.

Theses

(201) Geometric Probing (Doctoral Dissertation), advisor: Herbert Edelsbrunner.Report UIUCD- CS-R-88-1425, Department of Computer Science, Urbana IL, April 1988. (202) Complexity of Optimization Problems on Solitaire Game Turing Machines (M.S. thesis), Uni- versity of Illinois, advisor: David Muller,May 1985. (203) A Navigation Algorithm for Simple Vision in a Plane (B.S. thesis), University of Virginia, ad- visor: Edward Howorka, May 1983.

Other Publications

(204) Embeddings-enhanced Language Communities Separation Selected for special issue of Ap- plied Network Science.Also, Eighth Int. Conf.ComplexNetworks and their Applications, 16

December 10-12, 2019, Lisbon Portugal. (205) Time WindowFrechet and Weighted Edit Distance for Passively Collected Trajectories (with J. Ding and J. Gao). 28th Fall Workshop on Computational Geometry.,Queens College, NY October 26-27, 2018. (206) Optimal Phase Balancing Using Dynamic Programming With Spatial Consideration (with K. Wang and T.Robertazzi) College of Engeering and Applied Sciences Technical Reports. http://hdl.handle.net/11401/77882 October 12, 2017. (207) Capturing Properties of Names with Distributed Representations (with S. Han, Y.Hu, B. Coskun, and M. Liu). Presented at Yahoo Techpluse 2016. (208) Citation Histories of Papers: Sometimes the Rich Get Richer,Sometimes theyDon’t(with M. Hazoglou, V.Kulkarni, and K. Dill). ArXiv, https://arxiv.org/abs/1703.04746, March 14, 2017. (209) Exact Age Prediction in Social Networks, (with B. Perozzi). Poster session, 24th Int. World Wide Web Conference (WWW 2015) Florence Italy,May 18-22, 2015. (210) GIS Technology supports Taxi Tip Prediction (with O. Starov), Map Gallery within Esri User Conference 2014, July 14-17, San Diego. http://www.esri.com/events/user-conference/activi- ties/map-gallery.Selected for Esri Map Book, 2015 User Conference, volume 30. (211) News-Based Group Modeling and Forecasting (with W.Zhang). arXiv:1405.2622, May 14, 2014. (212) Measuring moderately subtle dimensions of culture with medium-sized data (with A. van de Rijt and C. Ward) American Sociological Association, 109th Annual Meeting (invited presen- tation). August 16-19, 2014. (213) Who’sthe most significant historical figure?, (with C. Ward), The Guardian (U.K.), Books/Point of View, January 30, 2014. (214) Who’sBiggest? The 100 Most Significant Figures in History,(with C. Ward), Time.com, De- cember 10, 2013. (215) Recoding the L gene of vesicular stomatitis virus by computer-aided synthesis leads to unex- pected biological phenotypes. (with B. Wang, G. Tekes, C. Ward, S. Skiena, B. Futcher,S. Mueller,S.Whelan, and E. Wimmer). 15th Int. Negative Strand Virus meeting, Granada Spain, June 16-21, 2013. (216) Empath: ACorpus-Independent Gold Standard for Sentiment Analysis (with C. Ward, E. Xavier and Y.Choi). Eight Int. Conf. on Emerging Technologies for a Smarter World (CE- WIT2011). Hauppauge NY,November 2-3, 2011. (217) Encoding matters: different codon-pair encodings of the same amino acid sequence have dif- ferent levels of function. (with S. Yan, S. Honey*, C. Ward, S. Mueller,E.Wimmer and B. Futcher) 13th Yeast Cell Biology Meeting, Cold Spring Harbor Laboratory,August 2011. (218) Large Scale Online TextAnalysis Using Lydia (with E. Key,L.Huddy,and M. Lebo). Annual Meeting of the American Political Science Association, Washington, DC September 2010. (219) Genetically Re-engineered AAVRep78 Identifies Sequence-Restricted Inhibitory Regions Af- fecting Adenoviral Replication (with V.Sitaraman, P.Hearing, S. Mueller,C.Ward, E. Wim- mer,and W.Bahou) American Society of Gene and Cell Therapy13th annual meeting, Wash- ington DC, May 19-22, 2010. (220) Lydia: News/Blog Analysis (with M. Bautin and C. Ward). Second Hadoop Summit,Santa Clara CA, June 10, 2009.‘ (221) Designing the World’sShortest Gene (with V. Patsalo, D. Papamichail, and D. Green). Syn- thetic Biology 4.0,Hong Kong, October 10-12, 2008. 17

(222) Design and Synthesis of Minimal and Persistent Protein Complexes(with V.Patsalo, D. Pa- pamichail, and D. Green). Microsoft eScience Workshop, Chapel Hill NC, October 21-23 2007 (223) Large-Scale Sentiment Analysis for News and Blogs (Demonstration) (with N. Godbole and M. Srinivasaiah). Int. Conf.onWeblogs and Social Media (ICWSM 2007), DenverCO, March 26-28, 2007. (224) N-235 Population Dynamics in the Rhizosphere: HowElevented CO2 Transforms Microbial Community Composition. (with C. Lesaulrier,D.Papamichail, S. McCorkle, B. Zak, B. Ol- livier,S.Taghavi, and D. van der Lelie). Poster Session ASM ’06, June 2006. (225) N-158 Phylogenetic Identification of the Rhizosphere of Poplar: Effects of Elevated CO2 on Microbial Community Composition (with C. Lesaulrier,D.Papamichail, S. McCorkle, B. Ol- livier,S.Taghavi, and D. van der Lelie). Poster Session ASM ’05, June 2005. (226) Meta-analysis Based on False Discovery Rate, (with S. Pyne and B. Futcher), Poster Session RECOMB ’05, Boston MA, May 15-18, 2005 (227) Combinatorica: ASystem for Exploring Combinatorics and Graph Theory in Mathematica, (with S. Pemmaraju), Graph Theory Notes of NewYork,GTD 47. (228) Alphabet Permutation for Differentially Encoding Text(with G. Landau and O. Levi). 11th Symp. of String Processing and Information Retrieval (SPIRE ’04), Springer-Verlag LNCS 3246, 216-7. (229) Resolving Ambiguity in Overloaded Keyboards (with F.Evans and A. Varshney), Godfried Toussaint Festscrift,http://cs.smith.edu/˜orourke/GodFest/ 2004. (230) Sequence Assembly for Single Molecule Methods (with A. Smirnov). ThirdRECOMB Satel- lite Meeting on DNASequencing Technologies and Computation,Stanford CA, May 17-18, 2003. (231) Coloring Graphs with a General Heuristic Search Engine, (with V.Phan). Computational Symposium: Graph Coloring and Generalizations,Cornell University,Ithaca NY,Sept. 8, 2002. (232) Evidence of Natural Selection in Synonymous RNACoding Spaces (with B. Cohen), Mathe- matical Formalisms for RNAStructureWorkshop,poster session, Montreal, CA, May 26-27, 2001. (233) On the Manufacturability of Paperclips and Sheet Metal Structures, with E. Arkin, S. Fekete, and J. Mitchell), 17th European Workshop on Computational Geometry,pp. 187-190, Berlin, Germany, March 26-28, 2001. (234) Some Lower Bounds on Geometric Separability Problems, (with E.M. Arkin, F.Hurtado, J.S.B. Mitchell, and C. Seara), Proc. II Jornadas de Matematica discreta y algoritmica,11-15, Mallorca, Spain, September 11-12, 2000. (235) Some Separability Problems in the Plane (with E. Arkin, F.Hurtado, J. Mitchell, and C. Seara). 16th European Workshop on Computational Geometry,51-54, Eliat, Israel, March 13-15, 2000. (236) A NewGraph Representation for Combinatorica (with J. Trias). Technical Report MAII- IR-99-0032, Universitat Politechica de Catalunya, Barcelona Spain, December 1999. (237) HowGood is Genome-LevelFragment Assembly? (with T.Chen). (poster session) RECOMB ’98, NewYork, March 1998. (238) Special Issue: Selected papers from the ARO/MSI StonyBrook workshop on Computational Geometry,Guest Editors Forward (with E. Arkin and J. Mitchell) Int. J.Comp. Geometry and Applications,7(1997) 1-3. 18

(239) Enabling Virtual Reality for Large-Scale Mechanical CAD Datasets (with A. Varshney, J.El- Sana, F.Evans, L. Darsa, and B. Costa). Proc. 1997 ASME Design Engineering Technical Conferences (DETC’97), Sacremento CA, September 14-17, 1997. (240) Geometric Decision Trees for Optical Character Recognition (with G. Sazaklis, E. Arkin, and J. Mitchell). 13th ACM Symp. Computational Geometry,short communications, applied track. June 1997. (241) Stripe: ASoftware Tool for Efficient Triangle Strips (with F.Evans and A. Varshney). SIG- GRAPH Technical Sketches.Visual Proceedings, Computer Graphics Annual Conference Se- ries/1996, p. 153 (242) Computational Support for Primer Walking (with T.Chen), University of Pennsylvania Com- putational Biology Conference (poster session), Princeton University,May 17-19, 1996. (243) Randomized Algorithms for Identifying Minimal Lottery Ticket Sets (with F.Younas), Jour- nal of Undergraduate Research, 2-2 (1996) 88-97. (244) Length Ratios for Discrete Distance Metrics (with S. Balakrishnan and C. Xu), Report 95-8, Department of Computer Science, State University of NewYork, StonyBrook, June 1995. (245) LINK: ACombinatorics and Graph Theory Workbench for Applications and Research (with J. Berry,N.Dean, P.Fasel, M. Goldberg, E. Johnson, J. MacCuish, and G. Shannon). DIMACS Technical Report 95-15, Rutgers University,May 1995. (246) Inducing Better Codes from Examples (with Yaw-Ling Lin). ThirdIEEE Data Compression Conference (poster session), Snowbird UT,April 1993. (247) Second MSI StonyBrook Workshop on Computational Geometry Final Report, (with Joe Mitchell), Department of Applied Mathematics, State University of NewYork, StonyBrook, November 1992. (248) MSI StonyBrook Workshop on Computational Geometry Final Report, (with Joe Mitchell), SUNYSB-AMS-91-14, Department of Applied Mathematics, State University of NewYork, StonyBrook, November 1991. (249) Combinatorial Mathematica. Report 90-24, Department of Computer Science, State Universi- ty of NewYork, StonyBrook, July 1990. (250) Mechanizing Mathematics Unix Review (December 1989) 7-12,66-71. (251) Academic Computing in the Year 2000 (with LukeYoung, Kurt Thearling, Arch Robison, Stephen Omohundro, Bartlett Mel, and Stephen Wolfram). Academic Computing (May/June 1988) 7-12, 62-65. (252) An OverviewofMachine Learning in Computer Chess, International Computer Chess Associ- ation Journal 9 (1986), 20-28. (253) Honeywell Futurist Award Essay FUTURICS 9-4 (1985), 20-21. (254) An Evaluation of Compression Algorithms for Weather Radar Images: Preliminary Report, Report 47PM-WX-0007, MIT Lincoln Laboratories, Lexington MA, 1984.

Patents

(1) Name-Based Classification of Electronic Account Users (with J. Ye, Y.Hu, B. Coskun, and M. Liu). Filed: January 27, 2017. U.S. Patent 10,636,048. Issued April 28, 2020. (2) Attenuated Influenza Viruses and Vaccines (with S. Mueller,E.Wimmer,B.Futcher,and C. Yang). Filed: March 15, 2014. U.S. Patent 10,316,294. Issued June 11, 2019. (3) Methods of Making Modified Viral Genomes (with E. Wimmer,S.Mueller,B.Futcher,D.Pa- pamichail, J. Coleman, and J. Cello). Filed: September 7, 2016. U.S. Patent 10,023,845. Is- sued July 17, 2018. 19

(4) Apparatus and Method for Optimal Phase Balancing Using Dynamic Programming with Spa- tial Consideration (with T.Roberazzi and K. Wang). Filed: December 10, 2013. U.S. Patent 9,728,971. Issued August 8, 2017. (5) Attenuated Viruses Userful for Vaccines (with E. Wimmer,S.Mueller,B.Futcher,D.Pa- pamichail, J. Coleman, and J. Cello). Filed: March 31, 2008. U.S. Patent 9,476,032. Issued October 25, 2016. (6) Methods and Systems for Determining Media Value (with GregArtzt, Mark Fasciano, and LevonLloyd). Filed: March 14, 2011. U.S. Patent 8,402,035. Issued March 19, 2013. (7) Large-Scale Sentiment Analysis (with Namrata Godbole and Manjunath Srinivasaiah). Filed April 24, 2007. U.S. Patent 7,996,210. Issued August 9, 2011. Supplemental patent U.S. 8,515,739 issued August 20, 2013. (8) System and Method for Entering TextinaVirtual Environment, (with Francine Evans and Amitabh Varshney). Filed July 30, 1999. U.S. Patent 6,407,679. Issued June 18, 2002. (9) Sentence Reconstruction using Word Ambiguity Resolution, (with Harald Rau). Filed June 30, 1995. U.S. Patent 5,828,991. Issued October 27, 1998. Supplemental patent U.S. 5,960,385 issued September 28, 1999. Supplemental patent U.S. 7,440,889 issued October 21, 2008. (10) A Method of Identifying Sequence in a Nucleic Acid Target Using Interactive Sequencing by Hybridization. U.S. Patent 5,683,881. Issued November 4, 1997.

Grants

(1) StonyBrook Research Development Grant, “The Act of Counting”, 2/88-2/89, $3,500. Stony Brook Research Development TravelGrant, April 1990, $550. (2) Addison-WesleyManuscript Preparation Grant, “The Act of Counting”, 7/89, $6,700. (3) Wolfram Research Inc. Development Grant, “The Act of Counting”, 1/90-12/90, $28,600. (4) CISE Research Instrumentation Grant, “Research Equipment for Parallel Scientific Comput- ing” (with Y.Deng, J. Glimm, J. Grove,A.Kaufman, R. Tew arson, and L. Wittie) $195,822. (5) NewYork Science and Technology Foundation, “Data Compression for High-Density Bar- codes”, 6/1/91-6/1/92, $40,000. (6) NewYork Science and Technology Foundation, “Interactive Autostereoscopic Visualization” (with Ari Kaufman), 4/1/91-3/31/92, $40,000. Dimension Technologies Inc. Research Grant, “Stereoscopic Display and Mathematica”, 6/90, $2,500 (7) NSF Research Initiation Award, “Algorithms for Combinatorial Computing Environments”, 6/15/91-5/31/93, $37,137. (8) NSF CCR/CISE Grant with Infrastructure, “Combinatorial Computing Environments and Ex- perimental Discrete Mathematics” (with Nate Dean, Mark Goldberg, Daniel Gorenstein, and GregShannon), 6/1/93-5/30/96, $347,205 (StonyBrook subcontract $30,000). (9) ONR Young Investigator Program, “Environments for Combinatorial Computing”, 5/1/93-4/31/96, $225,000. Matching funds from NavalResearch Laboratory,6/1/94-5/30-95, $30,000. (10) NSF ILI Program, “Teaching with Combinatorica and Semantica” (with R. Larson and D. Warren), 6/15/93-5/31/95, $51,027. (11) International Information Science Foundation, Visiting Researcher Support Program, Spring/Summer 1993, 300,000 Yen. (12) Center for Advanced Manufacting - SUNY StonyBrook, “Geometric Methods for Optical Character Recognition”, (with E. Arkin and J. Mitchell), 9/1/93-5/31/94, $14,500. 20

(13) Symbol Technology,“Development of PDF-417 Footnotes”, (with T.Pavlidis), 9/1/93-8/31/94, $25,000. (14) National Science Foundation, “Interactive Sequencing by Hybridization”, 8/1/96-7/31/00, $210,000. (15) Office of NavalResearch, “Restricted Cycle Problems with Applications“, 4/1/97-9/30/99, $230,656. (16) BrookhavenNational Laboratory,“Computational Support for Primer Walking”, 5/15/96-12/15/96, $12,851. Renewed, 12/15/96-9/30, $14,769. (17) Syngen Corporation and SPIR, “Efficient and Robust Methods for Optical Character Recogni- tion”, (with E. Arkin and J. Mitchell), 6/1/96-12/15/96, $20,000. (18) US-Spain Joint Grant, “Research in Computational Geometry and Combinatorial Algorithms: Collaborations between StonyBrook and UPC” (with E. Arkin, F.Hurtado, J. Mitchell and V. Sacristan), 5/1/98-4/30/99, $30,000. (19) Center for Biotechnology,“Software for Gene Regulatory Analysis” (with J. Konopka), 7/1/99-6/15/00, $40,000. (20) Office of NavalResearch, “Heuristic Approaches to Optimization with Applications”, 1/1/00-9/30/02, $211,256. (21) “Algorithm Engineering for NP-Complete Problems”, National Science Foundation, 9/1/00-8/30/03, $210,000. (22) Syncytium Inc. and SPIR, “Gene Expression Profiling”, 1/1/01-8/31/01, $20,000. (23) “Gene Design for Vaccines and Therapeutic Phages”, (with E. Wimmer), National Science Foundation ITR, 10/1/03-9/30/07, $793,623. REU supplement, summer 2004, $6,000 (24) “Computational Analysis of Genomic Sequence Tags”, BNL/StonyBrook seed grant, 6/1/03-5/31/04, $25,000. (25) NewApproaches for Synthesizing Spotted Microarrays Center for Biotechnology ITD pro- gram, 7/1/03-6/15/04, $30,000. (26) Spotted Microarray Design / Analysis for Human Platelet Disorders, (with W.Bahou) Target- ed Research Opportunity,School of Medicine, 9/1/03-2/28/05, $25,000. (27) Agents of Bioterrorism: Pathogenesis and Host Defense, (PI: J. Benach, Microarray and Bioinformatics Core with B. Futcher), NIAID/NIH, 7/1/04-6/30/09, $14,492,451. subcon- tract, $46,319. (28) BrookhavenNational Laboratory,“Bacterial Population Assay via k-mer Analysis”, 10/1/04-9/30/05, $31,009. Renewal10/1/2005-9/30/2007, $63,582. (29) Sequence Assembly for High-Throughput Technologies, National Science Foundation, 7/01/05-6/30/08, $330,018. (30) The Newspaper of the Future: Personalized News Selection and Delivery (with H. Schneider), Cewit seed funding, 1/1/07-12/30/07, $9,000. (31) Linguistically-Informed Question Answering (with A. Stent), Cewit seed funding, 1/1/07-12/30/07, $9,000. (32) Design and Synthesis of Minimal and Persistent Protein Complexes(with D. Green) Mi- crosoft Research, 6/1/07-5/31/08, $90,000 (33) Coding News Media and Blog Content in the 2008 Presidential Election Cycle (with L. Hud- dy and M. Lebo), AnnenbergElection Study,6/1/07-12/31/09, $126,859 (34) Modeling Information Spread through Large-Scale News and Blog Analysis, National Geospatial-Intelligence Agency, 7/1/07-6/30/10, $240,000. 21

(35) PLATO:Phased Learning through Analyzing Teaching and Observation, DARPA, 9/1/07-12/31/08, $125,000 (subcontract from SRI) (36) Entity-Oriented Search and Analysis, Google, 9/01/07-8/31/08, $50,000. (37) Synthetic Viral Genome Design for Rapid Vaccine Development (with E. Wimmer and B. Futcher), NIH 5R01AI07521903, 4/1/08-3/31/13, $2,784,141. (38) Synthetic Viral Genome Design for Rapid Vaccine Development (with S. Mueller), FUSION seed Grant, School of Medicine, StonyBrook University,11/1/07-10/30/09, $80,000. (39) Better Sentiment Analysis through Forecasting, NSF,9/1/10-8/30/13. $407,164. REU supple- ment, summer 2011, $16,000. (40) Synthetic Sequence Designs for Real Biology (with B. Futcher), NSF,3/1/11-2/28/14. $497,915. (41) Who’sBigger? Ranking Historical Figures through a Telephone Survey (with A. van de Rijt) Seed Grants for Survey Research (SGSR), StonyBrook University,6/1/12-3/1/13. $10,000. (42) Assembly algorithms for single-molecule DNAsequencing, (with M. Schatz) SBU-CSHL Seed Grants, 9/1/12-8/31/13. $30,000. (43) Gene Design Modulating Secondary Structure, Planet Biotechnology,10/1/12-9/30/13. $2,000. (44) Measuring Language Shifts by Place, Time, and Community,Google, 1/1/14-12/30/14. $30,000 (45) ABI Innovation: Sequence Optimization for Synthetic Biology (with B. Futcher), NSF, 6/1/14-5/31/19. $499,881. (46) BIGDAT A:F:DeepWalking Graphs for Feature Extraction NSF,IIS-1546113 1/1/16-12/30/20. $731,260. (47) Encoding Biases and Their Effects on Translation, Quality Control, and Gene Expression NIH, co-PI (with B. Futcher). 1R01GM119175, 08/01/16-4/30/2020. $1,249,889. (48) Trainspotting: Improving Energy EfficiencyinFreight Transportation, PowerBridge New York, 5/1/17-11/30/18. $150,000. (49) Protecting the Aging Brain: Self-Organizing Networks and Multi-Scale Dynamics Under En- ergy Constraints, W.M. Keck Foundation, co-PI (with L. Mujica-Parodi and K. Dill). 9/1/17-8/30/20. $1,026,431 ($106,130 to Computer Science). (50) MRI: Acquisition of Heterogeneous Computer System for Machine Learning (with M. Ferd- man, A. Kaufman, Z. Liu, and D. Samaras), NSF 10/1/19-9/30/22. $571,975. (51) EAGER: AI-DCL: Measuring and Mitigating Animosity Tow ard AI Systems and Science (with J. Jones), NSF,8/15/19-7/31/21. $299,712. (52) Large-Scale Graph Embedding, Yahoo Faculty Research and Engagement Program (FREP), Verizon Media, 9/1/19-8/31/20. $25,000. (53) NCS-FR: Protecting the Aging Brain: Self-Organizing Networks and Multi-Scale Dynamics under Energy Constraints, NSF (with L. Mujica-Parodi, E. Ratal, N. Smith, and K. Dill) 8/1/19-7/30/22. $4,995,027. (54) Knowledge Graph Embedding Evolution for COVID-19, Northeast Big Data Innovation Hub. (with B. Zhou) 10/1/20-9/1/2021. $24,992. (55) Center for AI Approaches in Aging and Alzheimer’sDisease (CAiAD), NIH P30 Center (pending), Lead PI (with S. Clouston and M. Sano) 7/1/21-6/30/26. $19,978,130. (56) RI: Small: NLP for Book-Length Texts (pending) NSF,5/1/21-4/30/24. $499,995. 22

(57) AI Institute: Models of Me: Unlocking Human Potential through AI (pending), Lead PI (with D. Samaras, M. Ireland, K. Kitani, and J. Wobbrock) 9/1/21-8/30/26. $20,000,000.

Presentations

(1) Word and Graph Embeddings for Machine Learning (invited lecture) 7th International School on Big Data (BigDat 2021 Spring) Ben-Gurion University of the Negev, Beersheba, Israel April 18-22, 2021. (2) TBD, Center for Mathematical Sciences and Applications (CSMA), Harvard University,Cam- bridge MA, March 24, 2021. (3) Word and Graph Embeddings for Machine Learning, NewJerseyInstitute of Technology, Newark NJ, January 27, 2021. (4) Word and Graph Embeddings for Machine Learning Faculty Connections Lecture Series StonyBrook University,November 10, 2020. (5) Ask Me Anything, Cambridge University Competitive Programming Society,Cambridge Uni- versity,United Kingdom (via Zoom), September 2, 2020. (6) Machine Learning on Network Data SigmaCamp 2020, Silver LakeCamp and Conference Center Sharon CT,(via Zoom) August 21, 2020. (7) Word and Graph Embeddings for Machine Learning FREP talk, Yahoo Research, June 2, 2020. (8) Data Science, Computer Science and Real Science, SigmaCamp reviewlecture, May 19, 2020. (9) Can YouTrust Your Alexa? Ansche Chesed Salon (with Steve Bellovin), NewYork NY,April 12, 2020 (10) Interpreting Word and Graph Embeddings, Interpretable AI: Across the Disciplines, SUNY Global Center,New York NY,April 3, 2020. (postposed, COVID) (11) Word and Graph Embeddings for Machine Learning, SRI International, Princeton, NJ, March 20, 2020. (12) Representing Knowledge Through Word and Graph Embeddings, STEM Speaker Series, Melville Library,February 11, 2020. (13) TEDx: The Augmented Age (panelist), The Spur,Southampton NY January 28, 2020. (14) Word and Graph Embeddings for Machine Learning, Yahoo Labs NewYork NY,January 24, 2020. (15) AI Executive Breakfast (featured speaker), Center for Corporate Education, StonyBrook Uni- versity,November 20, 2019. (16) Word and Graph Embeddings for Machine Learning, (distinguished lecturer) Department of Computer Science, University of Virginia, Charlottesville VA, November 8, 2019. (17) Word and Graph Embeddings for Machine Learning Models, CEWIT 2019 Conference (invit- ed speaker), StonyBrook NY,November 6, 2019. (18) Data Science, Computer Science and Real Science, SigmaCamp 2019, Silver LakeCamp and Conference Center Sharon CT,July 14, 2019. (19) Natural Language Processing and News Analysis, Fair Media Council, Annual Summer Boot Camp, Oceanside NY,July 26, 2019. (20) Artificial Intelligence: HowAIisChanging our World, Patchogue-Medford Library, Patchogue NY,March 5, 2019. 23

(21) Word and Graph Embeddings with Applications, College of Business Research Seminar, StonyBrook NY March 1, 2019. (22) AI panel, Long Island Association, Melville NY,February 5, 2019. (23) Word and Graph Embeddings with Applications, Geomatric Analysis Approach to AI Work- shop, Harvard University,Cambridge MA. January 19, 2019. (24) AI PolicyForum (moderator), Rockefeller Inst. of Government Albany, NY, November 28, 2018. (25) Ethical AI? The Design of Moral Machines (invited presentation) Global Forum, Institute of Global Studies, Wang Center Theatre, StonyBrook, NY November 7, 2018. (26) Turning Graphs into Features via Network Embeddings (invited talk) GraphConnect 2018, NewYork, NY,September 20, 2018. (27) Applications of Word Embeddings, Department of Computer Science, The College of New Jersey, Ewing NJ, February 3, 2017. (28) Who’sBigger? A Quantitative Analysis of Historical Fame, Mathematical BiographyConfer- ence (invited speaker), British Society for the History of Mathematics, St. Andrews Universi- ty,Scotland. September 23-24, 2016. (29) Organizing the World, via Graph Embeddings and TSPs, Graph Theory Day 71 (invited speaker), Nassau Community College, Garden City,May 7, 2016. (30) DeepBrowse: An Interface for Aimless Browsing, Yahoo Research, NewYork NY.April 14, 2016. (31) Designing Viral Vaccines, RECOMB Satellite Conference on Bioinformatics Education: RE- COMB-BE (invited speaker), Howard Hughes Medical Institute (HHMI), Chevy Chase MD, November 13-15,2015 (32) Ask Me Anything, Hackerrank World Cup broadcast (invited presentation), September 9, 2015. (33) Who’sBigger? A Quantitative Analysis of Historical Fame, Yahoo! Research (invited speak- er: Big Thinker series), Sunnyvale CA, January 16, 2015. (34) Optimizing the Design of Coding Sequences, Stringology 2015 (invited speaker), Dead Sea, Israel, January 5, 2015. (35) Applications of Word Embeddings, Google Research, NewYork, October 29, 2014. (36) Who’sBigger? A Quantitative Analysis of Historical Fame, Yahoo! Research, NewYork NY, September 29, 2014. (37) Optimizing the Design of Coding Sequences, Ecology and Evolution Seminar,StonyBrook University,September 9, 2014. (38) Who’sBigger? Where Historical Figures Really Rank, (invited speaker) University of Vir- ginia-Wise, September 2, 2014. (39) Who’sBigger? Where Historical Figures Really Rank, Institute for Advanced Computational Science, StonyBrook University,March 7, 2014. (40) Who’sBigger? Where Historical Figures Really Rank, NewYork Public Library,Mid-Man- hattan Branch NewYork NY,January 23, 2014. (41) Algorithms in real life applications, Quafqaz University,Baku Azerbaijan, October 25, 2013. (42) Word Embeddings for all the World’sLanguages Seventh Int. Conf. Application of Informa- tion and Communication Technologies (AICT 2013), (keynote speaker) Baku State University, Baku Azerbaijan, October 23, 2013 24

(43) Who’sBigger? A Quantitative Analysis of Historical Fame, Microsoft Research, NewYork NY,August 15, 2013 (44) Redesigning Viral Genomes, Dept. of Computer Science Univ. ofHelsinki, Helsinki Finland. March 22, 2013. (45) My Vision for the Simons Center for Data Analysis, Simons Foundation, NewYork NY. February 20, 2013. (46) Who’sBigger? A Quantitative Analysis of Historical Fame, Wolfram Data Summit (invited talk), Mountain ViewCA. September 6, 2012. (47) Who’sBigger? A Quantitative Analysis of Historical Fame, Google Tech Talk, Mountain View CA. June 7, 2012. (48) Redesigning Viral Genomes, Agilent Technologies, Santa Clara CA, June 5, 2012. (49) Who’sBigger? A Quantitative Analysis of Historical Fame, Facebook, Palo Alto CA. June 4, 2012. (50) Redesigning Viral Genomes, BrookhavenNational Laboratory,Upton NY,May 11, 2012. (51) Large-Scale TextAnalysis with Lydia, IUCRC CDDAWorkshop, CEWIT,May 9, 2012 (52) Who’sBigger? A Quantitative Analysis of Historical Fame, TwoSigma Investments, New York, NY.March 21, 2012. (53) Synthetic Designs for Real Biology,Cold Spring Harbor Laboratory,Cold Spring Harbor NY November 16, 2011. (54) Being aProfessor at StonyBrook University,Mrs. Rosner’s2nd Grade Class, Nassakeag Ele- mentary School, Setauket NY,June 9, 2011. (55) Synthetic Designs for Real Biology,Viruses Forever--Wimmer Fest Symposium (invited speaker) StonyBrook NY,May 27, 2011. (56) Synthetic Designs for Real Biology,Stringology 2011, (invited speaker) Haifa, Israel, April 4, 2011. (57) News/Blog Analysis with Lydia, Rennsselear Polytechnic Institute (RPI), Troy, New York, March 3, 2011. (58) Synthetic Designs for Real Biology,Plum Island Animal Disease Center,Plum Island, NY. August 12, 2010. (59) News/Blog Analysis for the Social Sciences, Emeriti Faculty Colloquium, StonyBrook Uni- versity,March 5, 2010. (60) Genome Sequence Assembly and Synthetic Design: Better Reading and Writing through Al- gorithmetic, Laufer Center Seminar Series, StonyBrook University,December 1, 2009. (61) All IKnowisWhat I Read in the Newspaper: News/Blog Analysis for the Social Sciences, Computer Science Departmental Colloquium (inaugural), CEWIT,StonyBrook University, October 2, 2009. (62) Improving Movie Gross Prediction Through News Analysis, IEEE/ACM Int. Conf.Web Intel- ligence and Intelligent Agent Technology (WI 2009), Milan Italy,September 18, 2009. (63) Identifying Differences in News Coverage Between Cultural/Ethnic Groups, News Analysis Workshop of IEEE/ACM Int. Conf.Web Intelligence and Intelligent Agent Technology (WI 2009), Milan Italy,September 15, 2009. (64) News and Blog Analysis with Lydia, Seoul National University Seoul, Korea, July 17, 2009. (65) Designing Useful Viruses, Seoul National University Seoul, Korea, July 16, 2009. (66) News and Blog Analysis with Lydia, Shenzhen Inst. of Advanced Technology,Chinese Acad- emy of Sciences, Shenzhen, China, June 2, 2009. 25

(67) Designing Useful Viruses, Dept. of Computer Science, Hong Kong University,May 8, 2009. (68) Designing Useful Viruses, Shanghai Institutes for Biological Sciences (SIBS) Chinese Acade- my of Sciences (CAS) Shanghai, China, May 6, 2009. (69) Designing Useful Viruses, Dept. of Computer Science, Fudan University,Shanghai, China. May 5, 2009. (70) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, 20th anniversary program, Open University of Hong Kong, March 5, 2009. (71) Designing Useful Viruses, Dept. of Computer Science and Engineering, Hong Kong Univ. of Science and Technology,February 20, 2009. (72) Crystalizing Short Read Assemblies Around Seeds, Seventh Asian-Pacific Bioinformatics Conference, Beijing China, January 14, 2009. (73) Assembly for Double-End Short-Read Technologies, National Taiwan University,Taipei Tai- wan, December 12, 2008. (74) News and Blog Analysis with Lydia, Academia Sinica, Taipei Taiwan, December 11, 2008. (75) Designing Useful Viruses, Tsinghua University,Hsinhu Taiwan, December 10, 2008. (76) Assembly for Double-End Short-Read Technologies, International Workshop on Algorithms and Computing Technologies, Providence University,Taichung Taiwan. (invited speaker) De- cember 8, 2008. (77) News and Blog Analysis with Lydia, CRA Chairs Conference (invited panel). Snowbird UT, July 14, 2008. (78) News and Blog Analysis with Lydia, Computer Associates (CA, Inc), Islandia NY,March 18, 2008. (79) Calculated Bets: Computers, Gambling,and Mathematical Modeling to Win O’Reilly Mon- ey:Tech Conference (invited speaker), NewYork, NY,February 6-7, 2008. (80) Assembly for Short Read Sequencing, (invited speaker) Center for Research in Genomics, Barcelona Spain, January 10, 2008. (81) Assembly for Short Read Sequencing, American Museum of Natural History,New York, NY, December 7, 2007. (82) News and Blog Analysis with Lydia, Polytechnic University,Brooklyn NY,November 30, 2007. (83) Assembly ForShort-Read Sequencing Technologies, Courant Institute, NewYork University, November 12, 2007. (84) Designing Useful Viruses, 13th International meeting on DNABased Computers(DNA13), (invited speaker) Memphis, TN, June 4-7, 2007. (85) News and Blog Analysis with Lydia, TwoSigma Investments NewYork NY,April 27, 2007. (86) Assembly ForShort-Read Sequencing Technologies, RECOMB session on newgeneration se- quencing technologies, Oakland CA April 23, 2007. (87) News and Blog Analysis with Lydia, Monitor110, NewYork NY,April 16, 2007. (88) Large-Scale Sentiment Analysis for News and Blogs, Int. Conf. Weblogs and Social Media, Boulder CO, March 27, 2007. (89) News and Blog Analysis with Lydia, Student ACM chapter,StonyBrook University,Stony Brook NY,Feburary 21, 2007. (90) News and Blog Analysis with Lydia, Dept. of Political Science, University of Pennsylvania, Philadelphia PA, Feburary 16, 2007. 26

(91) Progress in Assembly For Short-Read Sequencing Technologies, Helicos BioSciences Corpo- ration, Cambridge MA, November 13, 2006. (92) News and Blog Analysis with Lydia, V Jornadas de Matema’tica Discreta y Algori’tmica Campus de Soria, Universidad de Valladolid (invited speaker) Soria, Spain, July 11, 2006. (93) News and Blog Analysis with Lydia, Combinatorial Pattern Matching (CPM 2006) (invited speaker), Barcelona Spain, July 5-7, 2006 (94) News and Blog Analysis with Lydia, Portland State University (invited speaker), Portland OR, June 5, 2006 (95) Exploiting Redundancyinthe Genetic Code, Synthetic Biology Retreat (co-organizer), Childs Mansion, SUNY StonyBrook May 26, 2006. (96) News and Blog Analysis with Lydia, Google Research Seminar,New York NY,February 27, 2006. (97) Assembly ForShort-Read Sequencing Technologies, 454 Corporation, Branford CT.Decem- ber 17, 2005. (98) Assembly ForShort-Read Sequencing Technologies, Helicos BioSciences Corporation, Cam- bridge MA, December 16, 2005. (99) Lydia: Knowledge Extraction from Curated TextSources, Stringology Research Workshop of the Israeli Science Foundation (invited speaker) CRI, University of Haifa, Israel April 3-8, 2005 (100) Lydia: Knowledge Extraction from Curated TextSources, Renaissance Technologies, Stony Brook, NY,March 31, 2005. (101) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, Student ACM Chapter,Dowling College, Oakdale NY,March 17, 2005. (102) Programming Contests, Algorithms, and the Real World, TopCoder Collegiate Challenge Fi- nals (invited speaker), Santa Clara CA, March 10, 2005. (103) Lydia: Knowledge Extraction from Curated TextSources, Center for Emerging Wireless and Information Technology (CEWIT) Conference, Crest HollowCountry Club, Woodbury NY, November 9, 2004. (104) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, Capital Manage- ment Group, First Citizens Bank, North Ridge Country Club, Raleigh NC. November 5, 2004. (105) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, 28th Atlantic Provinces Council on the Sciences (APICS) conference in Mathematics, Statistics, and Com- puter Science (keynote speaker) University of NewBrunswick, St. John, Canada. October 16, 2004. (106) Computational Graph Theory with Combinatorica, Graph Theory Day 47 (invited speaker), Half HollowHills High School, Dix Hills, NewYork May 8, 2004. (107) Computational Graph Theory with Combinatorica Mathematics Department, Pace University, NewYork NY,March 8, 2004. (108) Sequence Assembly for High-Throughput Technologies Computer Science Department, New JerseyInstitute of Technology,New ark NJ, March 1, 2004 (109) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, Adelphi Univer- sity,Garden City LI, November 24, 2003 (110) Exploiting Redundancyinthe Genetic Code, University of Memphis, Memphis TN, Novem- ber 6, 2003. 27

(111) Computational Graph Theory with Combinatorica and Calculated Bets: Computers, Gam- bling, and Mathematical Modeling to Win, University of the South, Sewanee TN, November 5, 2003. (112) Computational Graph Theory with Combinatorica, Graph Theory Day 46 (invited speaker) NewYork Academy of Sciences, Manhattan, postponed from November 1, 2003. (113) Exploiting Redundancyinthe Genetic Code, Ongoing Research Seminar,Dept. of Computer Science SUNY StonyBrook, September 12, 2003 (114) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, Barra’s27th An- nual Research Seminar (invited dinner speaker), Pebble Beach CA, June 10, 2003 (115) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, CSAM Seminar Series, Montclair State University,Montclair,NJ, February 6, 2003 (116) Exploiting Redundancyinthe Genetic Code, Florida International University,Miami, Florida, January 8, 2003. (117) Exploiting Redundancyinthe Genetic Code, Cold Spring Harbor Laboratory,Cold Spring Harbor,LI, October 2, 2002. (118) Exploiting Redundancyinthe Genetic Code, University of Pisa, Italy,July 18, 2002. (119) Exploiting Redundancyinthe Genetic Code, University of Padua, Italy,June 25, 2002. (120) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, Dept. of Com- puter Science Colloquium, Tel AvivUniversity,Tel Aviv, Israel, May 5, 2002. (121) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, Dept. of Com- puter Science Colloquium, Bar-Ilan University,Tel Aviv, Israel, May 2, 2002. (122) Combinatorial Dominance and Heuristic Search, Dept. of Computer Science, University of Haifa, Israel, April 22, 2002. (123) Combinatorial Dominance and Heuristic Search, Ben-Gurion University of the Negev, Beer- sheva,Israel, March 19, 2002. (124) Exploiting Redundancyinthe Genetic Code, Ben-Gurion University of the Negev, Beersheva, Israel, March 18, 2002. (125) Exploiting Redundancyinthe Genetic Code, Computational Biology Seminar,HebrewUni- versity,Jerusalem, Israel, March 10, 2002. (126) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, IBM Research Lab, Haifa, IL, February 26, 2002. (127) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, Univ. de. Valli- dolid, Spain, February 19, 2002. (128) Combinatorics and Graph Theory in Mathematica, Univ. de. Vallidolid, Spain, February 15, 2002. (129) Combinatorial Dominance and Heuristic Search, Theory Seminar,HebrewUniversity, Jerusalem, Israel, January 2, 2002. (130) Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, (invited speaker) HaifaWinter Workshop on Computer Science and Statistics, Rothchild Institute, HaifaIsrael, December 18, 2001. (131) Exploiting Redundancyinthe Genetic Code, Computational Biology Seminar,The Technion, Haifa, Israel, November 29, 2001 (132) Combinatorial Dominance and Heuristic Search, Center for Graphics and Geometric Comput- ing, The Technion, Haifa, Israel, November 28, 2001 28

(133) Exploiting Redundancyinthe Genetic Code, Computational Biology Seminar,Tel AvivUni- versity,Tel Aviv, Israel, November 21, 2001 (134) Designing Better Phages, Intelligent Systems for Molecular Biology (ISMB 2001), Copen- hagen, Denmark, July 24, 2001. (135) Baccalaureate Honors Convocation, (commencement speaker), SUNY StonyBrook, May 17, 2001. (136) Analysis Techniques for Microarray Time-Series Data, Workshop on Gene Expression Analy- sis (invited speaker), Institute for Pure and Applied Mathematics (IPAM), Department of Mathematics, UCLA, Los Angeles, CA, November 9, 2000. (137) Exploiting Redundancyinthe Genetic Code, Dept. of Computer Scientist, University of Iowa, Iowa City,IO, October 13, 2000. (138) Exploiting Redundancyinthe Genetic Code, Celera Genomics, Rockville, MD, May 23, 2000. (139) Exploiting Redundancyinthe Genetic Code, Dept. of Computer Science University of Cali- fornia, Santa Barbara CA, May 12, 2000. (140) Exploiting Redundancyinthe Genetic Code, Dept. of Computer Science Columbia Universi- ty,New York NY,May 1, 2000. (141) Exploiting Redundancyinthe Genetic Code, Dept. of Medical Informatics Columbia Univer- sity,New York NY,April 25, 2000. (142) Combinatorial Structure Optimization Problems in Biology,ATT Shannon Laboratories, Florham Park, NJ, February 10, 2000. (143) Optimizing Combinatorial Library Construction via Split Synthesis, Universitat Politechica de Catalunya, Barcelona Spain, June 23, 1999. (144) Identifying Gene Regulatory Networks from Experimental Data, IMIM (Medical Research In- stitute), Barcelona Spain, June 22, 1999. (145) Jai-Technology: Computers, Gambling, and Mathematical Modeling to Win, LIMDA, Univer- sitat Politechica de Catalunya, Barcelona Spain, June 2, 1999. (146) Who is interested in algorithms and why?: lessons from the StonyBrook Algorithms Reposi- tory.Workshop on Computational Graph Theory and Combinatorics (invited speaker) Univer- sity of Victoria, B.C. Canada, May 6-8, 1999. (147) Optimization Problems in Combinatorial Chemistry and Gene Regulatory Analysis Depart- ment of Bioinformatics, University of Washington, Seattle WA, May 5, 1999. (148) Algorithmic Resources for Research and Industry,Corning Incorporated, Corning NY,April 26, 1999. (149) Designing and Fabricating Libraries for Combinatorial Chemistry,Department of Computer Science, Polytechnic University,Brooklyn, NY.November 23, 1998. (150) Who is interested in algorithms and why?: lessons from the StonyBrook Algorithms Reposi- tory.Second Workshop on Algorithm Engineering, Saarbrucken, Germany, August 21, 1998. (151) Optimization Problems in Graphics and User-Interfaces, Mitsubishi Electric Research Labora- tory (MERL), Cambridge MA, May 21, 1998. (152) What can I get from Computer Science, Minorities in Engineering and Applied Sciences, SUNY StonyBrook, February 18, 1998. (153) Matching for Run-Length Encoded Strings, Operations Research Seminar,Department of Ap- plied Mathematics, SUNY StonyBrook, October 22, 1997. (154) String Algorithmics for Fun and Profit, Ongoing Research Seminar,SUNY StonyBrook, September 12, 1997. 29

(155) Optimization Problems in Graphics and User-Interfaces, Universidad Politécnica de Madrid, Spain, June 19, 1997. (156) Matching for Run-Length Encoded Strings Sequences ’97,Positano Italy,June 13, 1997. (157) Optimization Problems in Graphics and User-Interfaces, Department of Computer Science, University of Arizona, March 27, 1997. (158) Optmization Problems in Graphics and User-Interfaces, Department of Computer Science, Rensselaer Polytechnic Institute, TroyNY, February 12, 1997. (159) Interactive Sequencing by Hybridization, Electrical Engineering Seminar,SUNY Stony Brook, February 5, 1997. (160) Fabricating Arrays of Strings, RECOMB ’97, Santa Fe, NewMexico, January 20, 1997. (161) Interactive Sequencing by Hybridization, Bar-Ilan Computer Science Forum, Bar-Ilan Univer- sity,Israel, June 9, 1996. (162) Interactive Sequencing By Hybridization, Bat Sheva deRothschild Seminar on Computational Aspects of the Human Genome Project, Nahsholim, Israel, June 6, 1996. (invited speaker) (163) Reconstructing Strings from Substrings in Rounds, IEEE Symposium on Foundations of Computer Science (FOCS), Milwaukee WI, October 25, 1995. (164) Teaching Combinatorics and Graph Theory with Mathematica, at ‘Exploring Formal Methods in the Early Computer Science Curriculum’, CUNY Graduate Center,September 16, 1995. (165) What is Computational Biology?, Ongoing Research Seminar,Dept of Computer Science, SUNY StonyBrook, NY,September 8, 1995. (166) Dialing for Documents: an Experiment in Information Theory,Bellcore, Morristown NJ, June 15, 1995. (167) Adventures in Decision Trees, University of the Baleric Islands, Mallorca Spain, February 24, 1995. (168) Reconstructing Strings from Substrings, and Adventures with Decision Trees, Graph Theory and Computational Geometry Seminars, Universitat Politechica de Catalunya, Barcelona Spain, February 22, 1995. (169) Combinatorics and Graph Theory with Mathematica, Newtechnologies and their influence on teaching mathematics, (invited speaker), Barcelona Spain, February 16, 1995. (170) Dialing for Documents: an Experiment in Information Theory,DIMACS, Rutgers University, January 30, 1995. (171) Dialing for Documents: an Experiment in Information Theory,MIT Media Lab, Boston MA, January 4, 1994. (172) Adventures with Decision Trees, Theory of Computation Seminar,Rutgers University,De- cember 2, 1994. (173) Dialing for Documents: an Experiment in Information Theory,Sev enth ACM SIGGRAPH Symposium on User Interface Software and Technology (UIST ’94), Marina Del ReyCA, November 3, 1994. (174) Dialing for Documents: an Experiment in Information Theory,Environmental Science Re- search Institute, Redlands CA, November 1, 1994. (175) WhyTheory is More Important than Practice: Combinatorial Algorithms and Applications, Department of Computer Science, SUNY StonyBrook, October 28, 1994. (176) Reconstructing Strings from Substrings, DIMACS Workshop on Combinatorial Methods for DNAMapping and Sequencing (invited speaker), Rutgers University,October 7, 1994. 30

(177) Reconstructing Strings from Substrings, Utrecht University,Utrecht, The Netherlands, Octo- ber 3, 1994. (178) Hamiltonian Triangulations for Fast Rendering, Second European Symposium on Algorithms, Arnheim, The Netherlands, September 26, 1994. (179) Reconstructing Strings from Substrings, Princeton University,Princeton NJ, September 14, 1994. (180) Reconstructing Strings from Substrings, DIMACS Tutorial on Computational Biology,Rut- gers University,August 23, 1994. (181) Complexity Aspects of Visibility Graphs, Minisymposium on Visibility Graphs (invited speaker), Seventh SIAM Conf. on Discrete Mathematics, Albuquerque NM, June 25, 1994. (182) Computer Science and DNASequencing, Mathematics Awareness Week, Deer Park High School, Deer Park NY,April 27, 1994. (183) Reconstructing Strings from Substrings, Tow ards DNASequencing Chips, First World Con- gress on Computational Medicine, Public Health, and Biotechnology (invited speaker), Austin TX, April 26, 1994. (184) Reconstructing Strings from Substrings, Department of Computer Science, Rensselaer Poly- technic Institute, TroyNY, February 17, 1994. (185) Workshop with Combinatorica, DIMACS Teachers Program, Rutgers University,December 12, 1993. (186) Reconstructing Strings from Substrings, Department of Computer Science, Pennsylvania State University,State College, PA, November 11, 1993. (187) Reconstructing Strings from Substrings, Biocomputing seminar,DIMACS, Rutgers Universi- ty,October 25, 1993. (188) Combinatorics and Graph Theory with Mathematica, Department of Computer Science North Carolina State University,Raleigh, NC, October 13, 1993. (189) Reconstructing Strings from Substrings, Department of Computer Science, Columbia Univer- sity,New York, NY,September 24, 1993. (190) Reconstructing Strings from Substrings, Combinatorial Computing Seminar,CUNY Graduate Center,New York NY,September 20, 1993. (191) Thinking Backwards 2: Reconstruction Problems in Biology and OCR, Department of Com- puter Science, SUNY StonyBrook, September 10, 1993. (192) Combinatorics and Graph Theory with Mathematica, NSF Regional Geometry Institute, (in- vited speaker), Smith College, Amherst MA, July 6, 1993. (193) Thinking Backwards: Three Geometric Reconstruction Problems, Dept. Applied Mathematics and Physics, Kyoto University,June 4, 1993. (194) Decision Trees for Geometric Objects, Dept. of Information Science, University of Tokyo, May 31, 1993. (195) Combinatorics and Graph Theory with Mathematica, GSSM, University of Tsukuba, Tokyo, Japan, May 28, 1993 (196) Decision Trees for Geometric Objects, Ninth ACM Symposium on Computational Geometry, San Diego CA, May 21, 1993. (197) Thinking Backwards: Three Geometric Reconstruction Problems, Department of Mathemat- ics, SUNY StonyBrook, StonyBrook, NY,February 18, 1993. (198) Decision Trees for Geometric Objects, Workshop on Geometric Probing in Computer Vision, University of Maryland, College Park MD, January 14, 1993. 31

(199) Thinking Backwards: Three Geometric Reconstruction Problems, Department of Computer Science, SUNY StonyBrook, StonyBrook, NY,September 4, 1992. (200) Combinatorics and Graph Theory with Mathematica, Symposium on Calculators and Comput- ers in the Mathematics Curriculum (subplenary speaker), Seventh International Conference on Mathematics Education, Quebec, Canada, August 17, 1992. (201) Combinatorics and Graph Theory with Mathematica, Los Alamos National Laboratory,Los Alamos NM, July 7, 1992. (202) Combinatorics and Graph Theory with Mathematica, Use of Symbolic Computation in Under- graduate Mathematics Education (invited speaker), Denison University,Granville OH, June 26, 1992. (203) Combinatorics and Graph Theory with Mathematica, Conference on Instructional Technolo- gies, SUNY Faculty Access to Computing Technologies (invited speaker), Oneonta NY,May 28, 1992. (204) Combinatorics and Graph Theory with Mathematica, Queen’sUniversity,Kingston Ontario, May 26, 1992. (205) Computational Tools for Research in Discrete Mathematics (panel discussion), DIMACS Workshop on Computational Support for Discrete Mathematics,Rutgers University,March 12, 1992. (206) Analyzing Integer Sequences, DIMACS Workshop on Computational Support for Discrete Mathematics,Rutgers University,March 12, 1992. (207) Software demonstration session, ACM-SIAM Symposium on Discrete Algorithms, Orlando FL, January 27, 1992. (208) Combinatorics and Graph Theory with Mathematica / Finding Square Roots of Graphs, Uni- versity of Waterloo, Waterloo, Ontario, Canada, January 15, 1992. (209) Finding Square Roots of Graphs, Carnegie-Mellon University,Pittsburgh PA, January 13, 1992 (210) Combinatorics and Graph Theory with Mathematica, Minisymposium on Software in Discrete Mathematics (invited speaker), International Conference on Industrial and Applied Mathemat- ics, Washington DC, July 11, 1991. (211) Combinatorics and Graph Theory with Mathematica, NSF Faculty Enhancement Workship, SUNY StonyBrook, June 24, 1991. (212) Inducing Codes From Examples, IEEE Data Compression Conference, Snowbird UT,April 10, 1991. (213) What Do I Do, AMS Faculty Research Interests, Applied Mathematics Department, SUNY StonyBrook NY,January 30, 1991. (214) Combinatorics and Graph Theory with Mathematica, Mathematica Users Conference, San Francisco CA, January 13, 1991. (215) Mathematica Optimization Forum (panel discussion), Mathematica Users Conference, San Francisco CA, January 12, 1991. (216) Combinatorics and Graph Theory with Mathematica, Operations Research Seminar,SUNY StonyBrook, December 5, 1990. (217) Results in Geometric Probing, Workshop on Geometric Probing in Computer Vision, Cornell University,Ithaca NY,November 17, 1990. (218) Combinatorics and Graph Theory with Mathematica, Courant Institute, NewYork University, October 26, 1990. 32

(219) Combinatorics and Graph Theory with Mathematica, Scientific Computing and Automation Conference, (session chair / invited speaker) Philadelphia PA, September 18, 1990. (220) Reconstructing Sets from Interpoint Distances / Combinatorics and Graph Theory in Mathe- matica, Supercomputing Research Center,Lanham MD, August 27, 1990. (221) Reconstructing Sets from Interpoint Distances, Sixth ACM Symposium on Computational Ge- ometry,BerkeleyCA, June 8, 1990. (222) Combinatorics and Graph Theory with Mathematica, Mathematica Users Conference, Red- wood City CA, January 11, 1990. (invited speaker) (223) Reconstructing Sets from Interpoint Distances, Theory Seminar,SUNY StonyBrook, October 13, 1989. (224) Personal Computer of the Year 2000, IBM Kingston NY,August 30, 1989. (225) Personal Computer of the Year 2000, NSF Faculty Enhancement Workship, SUNY Stony Brook, July 27, 1989. (226) Personal Computer of the Year 2000, IBM Hawthorne, White Plains NY,May 31, 1989. (227) Reconstructing Sets from Interpoint Distances, Operations Research Seminar,SUNY Stony Brook, April 26, 1989. (228) Personal Computer of the Year 2000, Summagraphics, Fairfield CT,January 27, 1989. (229) Results in Geometric Probing, IBM T.J.Watson Research Center,Yorktown Heights NY,Jan- uary 17, 1989. (230) Personal Computer of the Year 2000, AClass Act,Reed College, Portland OR, October 17, 1988. (invited presentation). (231) Personal Computer of the Year 2000, Third National CCSSO Conference on Educational Technology,Charlotte NC, September 26, 1988. (invited presentation). (232) Mathematica and the Act of Counting, Apple Computer,Cupertino CA, July 11, 1988. (233) Results in Geometric Probing / Personal Computer of the Year 2000, University of Virginia, Charlotteville VA, April 20, 1988. (234) Geometric Probing (final examination), University of Illinois, Urbana IL, April 13, 1988. (235) Results in Geometric Probing / Personal Computer of the Year 2000, University of Rochester, Rochester NY,April 7-8, 1988. (236) Results in Geometric Probing / Personal Computer of the Year 2000, SUNY Albany, Albany NY,April 6, 1988. (237) Results in Geometric Probing / Personal Computer of the Year 2000, Yale University,New HavenCT, April 5, 1988. (238) Results in Geometric Probing / Personal Computer of the Year 2000, University of Pennsylva- nia, Philadelphia PA, April 4, 1988. (239) Results in Geometric Probing / Personal Computer of the Year 2000, Boston University, Boston MA, March 16, 1988. (240) Results in Geometric Probing / Personal Computer of the Year 2000, SUNY StonyBrook, StonyBrook NY,March 9, 1988. (241) The Personal Computer of the Year 2000, Apple Computer,Cupertino CA, January 28, 1988. (242) Results in Geometric Probing, Supercomputing Research Center,Lanham MD, January 4, 1988. (243) What IDid Last Summer (moderator), Department of Computer Science, University of Illi- nois, Urbana IL, September 2, 1987. 33

(244) Probing Convex Polygons with X-rays, SIAM Conference on Applied Geometry,AlbanyNY, July 22, 1987. (245) Encroaching Lists as a Measure of Presortedness, Combinatorics and Complexity,Chicago IL, June 17, 1987. (246) Probing Convex Polygons with X-rays, Bell Communications Research, Morristown NJ, May 28, 1987. (247) Probing Convex Polygons with X-rays, University of Illinois, Urbana IL, January 28, 1987. (248) Compiler Optimization by Detecting Recursive Subprograms, ACM ’85, DenverCO, October 16, 1985. (249) A Navigation Algorithm for Simple Vision in a Plane, 21st Southeast ACM Conference, Durham NC, April 8, 1983.

Teaching

(1) “Data Science”, StonyBrook, fall 2014, 2016, 2017. Approximately 100 students per semes- ter. (2) “Programming Challenges”, StonyBrook, spring 2001, 2003, 2004, 2005, and 2012. Approx- imately 10 students per semester. (3) “Computational Biology”, StonyBrook, fall 2000, 2002, 2003, 2004, 2005, 2006, 2007, 2009, 2010, 2011, 2012, 2013. Approximately 40-60 students per semester. (4) “Introduction to Computer Technology”, State University of NewYork, StonyBrook, fall 1996. Approximately 125 students per semester. (5) “Seminar in the Analysis of Algorithms”, State University of NewYork, StonyBrook, spring 1992-date, fall 1992-date. Informal reading group, spring 1990 and 1991, fall 1990 and 1991. Approximately 10 students per semester. (6) “Computational Finance”, graduate, State University of NewYork, StonyBrook, spring 2004, 2007. Approximately 15 students per semester. (7) “Advanced Algorithms”, graduate, State University of NewYork, StonyBrook, fall 1995 and spring 1998. Approximately 15 students per semester. (8) “Discrete Mathematics”, undergraduate, State University of NewYork, StonyBrook, fall 1998, 1999. Approximately 150 students per semester. (9) “Discrete Mathematics”, graduate, State University of NewYork, StonyBrook, fall 1989, 1990, 1991, 1992, spring 1999. Approximately 40 students per semester. (10) “Algorithms”, graduate, State University of NewYork, StonyBrook, spring 1989, 1990, 1994, and 1996. Approximately 40 students per semester. (11) “Algorithms”, undergraduate, State University of NewYork, StonyBrook, spring 1992, 1993, 1994, 1996, 1997, 2003, 2006, 2007, 2009, 2010, 2011, 2012, 2013, 2014, 2015 and fall 1998, 1999, 2000, 2004, 2016, 2017. Approximately 200 students per semester. (12) “Data Structures”, State University of NewYork, StonyBrook, fall 1988, 1993, 1997 and spring 1991. Approximately 80 students per semester. (13) “Problems in Combinatorial Algorithms”, SUNY StonyBrook, fall 1988. (14) “Combinatorial Algorithms”, University of Illinois, spring 1987 (teaching assistant). (15) “Introduction to Theory of Computation”, University of Illinois, fall 1986 (teaching assistant). (16) “Introduction to Computer Programming for Graduate Students”, University of Illinois, spring 1984, 1985, and 1986. 34

(17) “Microprocessor Systems”, University of Illinois, fall 1983, 1984, and 1985 (teaching assis- tant). (18) “Introduction to Data Processing”, MiddlesexCountry College, Edison NJ, summer 1981.

Students

(1) David J. Iannouci, masters project: “ANetwork-Distributed Version of GnuChess”. Started September 1988. Completed May 1989. (2) Rajeshwari Bommannavar, masters project: “Two Dimensional High-Data Capacity Barcode Simulation”. Started October 1988. Completed September 1989. (3) Anil Bhansali, masters project: “Combinatorial Mathematica”. Started February 1989. Com- pleted April 1990. Research ProficiencyExam: “Motion Planning and the Volume of Free Space”, passed August 1990. (4) Wai-Hong Leung, masters project: “Inducing Codes from Examples”. Started September 1989. Completed October 1990. (5) Yaw-ling Lin, Ph. D. student. Started February 1990. Research ProficiencyExam: “Graph Model Recognition and Inversion”, December 1990. Prelim August 1992. Completed July 1993. (6) Gopalakrishnan Sundaram, Ph. D. student (Applied Mathematics). Started September 1990. Prelim December 1992, “Combinatorial Algorithms for Computational Biology”. Completed November 1993. (7) Sridhar Balakrishnan, masters project: “Stereoscopic Graphics and Mathematica”. Started February 1992. Completed January 1993. (8) Chin-Chih Chang, masters project: “Parallel Alpha-Beta Search”. Started February 1992. Completed May 1993. (9) Michael Murphy, masters project: “Range Queries and Nearest Neighbor Search in Higher Di- mensions”. Started February 1992. Completed January 1993. (10) Harald Rau, masters project: “Dialing for Documents: Telephones and Information Theory”. Started February 1993. Completed August 1993. (11) Vassili Leonov, masters project: “CASE tools for LINK”. Started February 1993. Completed February 1994. (12) Yin-YiPhilip Juan, masters project: “Algorithms for the LINK library”. Started October 1993. Completed January 1995. (13) Tong Gao, masters project: “Algorithms for the LINK library”. Started October 1993. Com- pleted December 1994. (14) David Wagner,masters project: “Testing for the LINK Library”. Started January 1994. Com- pleted May 1995. (15) Ting Chen, Ph.D. student. Masters project: “Sorting with Fixed-Length Reversals”. Started May 1994. Completed February 1995. Research ProficiencyExam: passed August 1995. passed prelim December 1996. Dissertation: “Computational Genome Analysis”. Completed September 1997. (16) Francine Evans, Ph.D. student. (co-advisor with A. Varshney) Research ProficiencyExam: “Fast Rendering with Triangle Strips”, passed August 1995. passed prelim May 1997. Dis- sertation: “Efficient Interaction Techniques in Virtual Environments”. Completed August 1998. (17) RickyBradley, masters project: “Fabricating Arrays of Strings”, Started May 1995. Complet- ed December 1996. 35

(18) Barry Cohen, Ph.D. student. Started June 1997. Research ProficiencyExam: “Optimizing Split Synthesis for Combinatorial Chemistry”, passed October 1998. Prelim, passed April 2000. Dissertation: “Computing RNAcoding spaces and efficient combinatorial library con- struction”. Completed August 2001 (19) Meenakumari Nagarajan, masters project: “Successful Betting on Jai-alai” Started January 1997. Completed January 1998. (20) Ziping Fu, masters project: “WWW Interface for Overloaded Keyboards”. Started February 1998. Finished December 1998. (21) Vladimir Filkov,Ph.D. student, Started June 1998. Research ProficiencyExam: “Analyzing Gene Expression from Time-Series Data”, passed August 1999. Passed prelim May 2001. Dissertation: “Infering Gene Regulatory Networks”. Completed August 2002. (22) Pav elSumazin, Ph.D. student, Started September 1998. Research ProficiencyExam: “Detect- ing Shift Errors in Standardized Examinations”, passed November 1999. Passed prelim May 2001. Dissertation: “Combinatorial Search in Theory and Practice”. Completed August 2002. (23) Vinhthuy Phan, Ph.D. student, Started July 1999. Research ProficiencyExam: “Interactive Sequencing by Hybridization”, passed November 2000. Dissertation: “Combinatorial Meta- heuristic Search for Metaproblems”. Completed May 2003. (24) Zhi Jizu, masters project: “Edge Detection for Gene Regulatory Analysis from Time-Series Data”, Started March 2000. Completed October 2000. (25) Chi Hsiung (Jemmy) Lee, masters project: “Prediction systems for horse racing”. Started September 2000. Completed August 2001. (26) Rohan Fernandes, M.S. student, Started May 2001. masters project: “Multiuse PCR Primer Design”. Completed December 2002. (27) LevonLloyd, Ph.D. student, Started August 2002. Research ProficiencyExam: passed Au- gust 2003. Prelim: passed May 2005. Dissertation: “Lydia: A System for the Large Scale Analysis of Natural Language Text”. Completed: May 2006. (28) Saumyadipta Pyne, Ph.D. student, Started December 2002. co-advised with Bruce Futcher, Microbiology.Research ProficiencyExam: passed January 2004. Prelim: passed May 2005. Dissertation: “Computational Studies of Gene Expression and Sequence Data in Yeast”. Completed: May 2006. (29) Gregory Longo, masters project: “Buying Strategy for Long-Term Stocks”. Started March 2003. Finished December 2003. (30) Dimitris Papamichail, Ph.D. student, Started June 2003. Research ProficiencyExam: passed May 2004. Prelim: pased May 2006. Dissertation: “Analysis and Design of Genomic Se- quences”. Completed: July 2007. (31) Dimitrios Kechagias, masters project: “Financial analysis portal”. Started July 2004. Fin- ished May 2005. (32) AndrewMehler,Ph.D. student, Started January 2005. Research ProficiencyExam: passed January 2006. Prelim: passed November 2007. Dissertation: Interpreting News through the Science of Networks (33) Manjunath Srinivasaiah, masters thesis: “Large Scale Sentiment Analysis of Natural Language Te xt”. Started January 2005. Finished May 2006. (34) Namrata Godbole, masters thesis: “Web-Based Acquisition and Sentiment Analysis of Natural Language Text”. Started January 2005. Finished May 2006. (35) Prachi Kaulgud, masters thesis: “Named Entity Recognition with Lydia”. Started January 2005. Finished May 2006. 36

(36) Kil Jae Hong, masters project: “Question Answering with Lydia”. Started January 2005. Fin- ished May 2006. (37) Sandesh Devaraju, masters project: “Pipeline and database manangement for Lydia”. Started January 2006. Finished May 2007. (38) Ashwin Subrahmanya, masters project: “Web-generation for TextMap/Lydia”. Started Jan- uary 2006. Finished May 2007. (39) Sushma Devendrappa, masters project: “Textprocessing with Lydia”. Started January 2006. Finished January 2007. (40) Anand Mallangada, masters project: “Relation extraction for Lydia”. Started January 2006. Finished May 2007. (41) AlexTurner,masters project: “News Storm Analysis”. Started January 2006. Finished May 2007. (42) Jai Balani, masters project: “RSS spidering for Lydia”. Started January 2006. Finished June 2007. (43) Lohit Vijaya-renu, masters project: “Sentiment analysis for international news”. Started May 2006. Finished June 2007. (44) Mohammad Sajjad Hossain, Ph.D. student. Started May 2006. Research ProficiencyExam “Short read sequence assembly”: passed November 2007. Finished masters June 2008. (45) Mikhail Bautin, Ph.D. student Started May 2006. Research ProficiencyExam “Concordance- Based Entity Search”: passed May 2007. Prelim June 2008. Disseration: qNews analysis for the Social Sciences”: Finished August 2009. (46) Sriram Kumaran, masters project: “Relation extraction for Lydia”. Started January 2007. Fin- ished May 2008. (47) PaavanShanbhag, masters project: “Pipeline and database manangement for Lydia”. Started January 2007. Finished December 2007. (48) Gayathri Ravichandran-geetha masters project: “RSS spidering for Lydia”. Started January 2007. Finished December 2007. (49) Gopalakrishnan Iyer,masters project: “Web-generation for TextMap/Lydia”. Started January 2007. Finished May 2008. (50) Jahangir Mohammed, masters project: “Group membership identification for news entities”. Started May 2007. Finished July 2008. (51) Swapna Male, masters project: “Group membership identification for news entities”. Started May 2007. Finished December 2008. (52) Wenbin Zhang PhD student. Started September 2007. Research ProficiencyExam “Improv- ing Movie Gross Prediction Through News Analysis”: passed July 2008. Finished August 2010. (53) Anurag Ambekar,masters project: “Name Ethnicity Detection”. Started September 2007. Finished May 2009. (54) Shashank Naik, masters project: “Network analysis for news entities”. Started January 2008. Finished May 2009. (55) Akshay Patil, masters project: “Web generation for TextMap/Lydia”. Started January 2008. Finished May 2009. (56) Shrikant Shanbhag, masters project: “Distributed news indices and search”. Started January 2008. Finished December 2009. (57) Dymtro Molkov,masters project: “Cloud computing infrastructure for Lydia”. Started September 2008. Finished May 2009. 37

(58) Shrijeet Paliwal, masters project: “Blog spidering and analysis with Spinn3r”. Started January 2009. Finished May 2010. (59) Karthik Balaji, masters project: “Distributed Lydia Infrastructure”. Started May 2009. Fin- ished May 2010. (60) Yancheng (Francis) Hong, masters thesis: “Event Forecasting through News Sentiment Analy- sis”. Hong Kong University of Science and Technology.Started January 2009. Finished June 2010. (61) Girish Kathalagiri, masters project:“Access interface for TextMap”. started May 2009. (62) Nicolo Davis, Bala Mudiam, Pritam Damania, VaibhavShrivastava,started January 2010. (63) Goutam Bhat Rohith Memon Sandesh Singh Naresh Singh Started January 2011 (64) Rami al-Rfou, Ph.D. student. Started September 2011. Dissertation: “Polyglot: A Massive Multilingual Natural Language Processing Pipeline”. Defended May 2015. (65) Yanqing Chen, Ph.D. student, Started May 2011. Dissertation: “Natural Language Processing using "Word Connection Networks”. Defended May 2015. (66) Ruhksana Yeasmin, Ph.D. student. Started May 2011. Dissertation: “Synthetic Genome De- sign: Optimization and Analysis”. Defended May 2015. (67) Nikhil Patwardhan masters project:“Watch the Birdie: Birdwatching App for Android”. Start- ed January 2011. Finished December 2011. (68) Ajeesh Elikkottil, Dhruv Matani. Masters students. Started January 2012. (69) Bryan Perozzi Ph.D. student. Started September 2012. Dissertation ‘‘Local Modeling of At- tributed Graphs: Algorithms and Applications’’. Defended May 2016. (70) Viv ekKulkarni, MS/Ph.D. student. Started January 2013. Completed Prelim May 2016. Dis- sertation: ‘‘Statistical Models for Linguistic Variation in Online Media’’. Defended May 2017. (71) Yafei Duan, AbhinavGupta, Wenbin Lin, Moutoupsi Paul, Vishwas Sharma. Masters stu- dents. Started January 2013. (72) Ruhul Mohammad Amin, Ph.D. student. Started June 2014. Completed RPE May 2015. Pre- lim May 2018. Dissertation: ‘‘Homology Based Sequence Alignment and Annotation Algo- rithms’’. Defended June 2019. (73) Samundra Banerjee, Zichao Dai, Vincent Tsuei. Masters students. Started September 2014. (74) Rakesh Chada, Ashish Kulkarni, Ajay Lakshminarayanarao. Mastsers students. Started Jan- uary 2015. (75) Haochen Chen, Ph.D. student. Started September 2015. Completed RPE May 2016. Prelim May 2018. Dissertation: "Network Embeddings: Methods and Applications". Defended April 2019. (76) Yingtao (Alan) Tian, Ph.D. student. Started September 2015. Completed RPE May 2016. Prelim May 2018. Dissertation: "Representation Learning for Modeling Data in Multiple Modalities". Defended May 2019. (77) Arvind Ram Anantharam, Arun Babu, Parth Kamlesh Dandiwala, Arpita Sheth. Masters stu- dents. Started January 2016. (78) Yeseul Lee, Tim Vercruysse. Masters students. Started September 2016. (79) Alisa Yurovsky, Ph.D. student. Started May 2016. Completed RPE August 2017. Prelim May 2020. Dissertation: "Understanding Mechanisms of Translation and Transcription". De- fended December 2020. 38

(80) Junting Ye,Ph.D. student. Completed RPE May 2016. Transfered to me September 2016. Prelim May 2018. Dissertation: "Network Analysis and Modeling for News and Social Me- dia". Defended April 2019. (81) Mayur Ghogale Piyush Khemka Nidhi Mendiratta Siddharth Shah Masters students. Started January 2017. (82) Harsh Agarwal, Prakhar Avasthi, Mohit Goel, Ankur Rastogi, Sahil Sobti. Masters students. Started January 2018. (83) Mohit Choudhary,Abhishek Reddy,Sanjay MathewThomas. Masters students. Started Jan- uary 2019. (84) Xingzhi Guo, Ph.D. student. Started September 2019.

Professional Service

Member,College of Engineering and Applied Science (CEAS) Executive Committee, StonyBrook University,November 2017 to date. Faculty Service Award, Department of Computer Science, StonyBrook University,2006 and 2012. Member,IEEE Computer Society Awards Committee, 2012 to date. IEEE International Conference on Data Mining (ICDM) Awards Committee, 2018. Associate Chairman, Department of Computer Science, SUNY StonyBrook, August 1996 to August 1999. Editorial Board, IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB), 2003 to 2006. Associate Editor, The ACM Journal of Experimental Algorithmics,1995 to 2006. Program Committee for: ACMKnowledgeand Data Discovery (KDD 2020, 2021), Association for Computational Linguistics (ACL 2019, 2021), North American Chapter of the Association for Com- putational Linguistics - Human Language Technologies (NAACL-NLT2018, 2021), IEEE Int. Con- ference Application of Information and Communication Technologies (AICT 2017), ACMConference on Web Search and Data Mining (WSDM 2016), Empirical Methods in Natural Language Processing (EMNLP 2016, 2017, 2018, 2019, 2020), AAAI Conference on Weblogs and Social Media (ICWSM-11), World Wide Web Conference (now The Web Conference)(WWW 2011, 2018, TheWeb- Conf 2019, 2020), WorldCIST Graph Mining Workshop (WorldCIST 2020), Genome Informatics Workshop (GIW 2011, 2016), IEEE Foundations of Computer Sciences (FOCS 2010), Workshop of Social Media Analytics (KDD 2010), Combinatorial Pattern Matching (CPM 2007), Asia-Pacific Bioinformatics Conference (APBC 2016, 2015, 2014, 2010, 2009, 2007), Int. Workshop on Social Me- dia Analytics (SMA 2007, SOMA 2012), Int. Conf.onComputational Biology (RECOMB 2013, 2006, 2005, 2004), (RECOMB-Seq 2014, 2015, 2016), IEEE Conf.onComputational Advances in Bio and Medical Sciences (ICCABS 2013), IEEE Workshop on Issues and Challenges in Social Computing (WICSOC 2013), Symposium on String Processing and Information Retrieval (SPIRE 2011, 2010, 2008, 2006), Workshop on Intelligent Analysis of Processing of Web News Content (WI-IAT2009), Int. Symp. Algorithsm and Computation (ISAAC2010), International Symposium on Bioinformatics Research and Applications (ISBRA 2010, 2009), European Conference on Computational Biology (ECCB 2006, 2005), Int. Conf on Intelligent Systems for Molecular Biology (ISMB 2006), Workshop on Algorithms in Bioinformatics (WABI 2009, 2007, 2005, 2002), Int. Workshop on Experimental and Efficient Algorithms (WEA 2008, 2004, 2003), Fun with Algorithms (FUN 2004, 2010), Workshop on Interdisciplinary Applications of Graphs and Algorithms (2002), Workshop on Algorithm Engineering (WAE 2000), Workshop on Algorithm Engineering and Experimenation (ALENEX 1999), ACMSym- posium on Computational Geometry (SoCG 1998), Int. Conf.onAlgorithms for Computational Biolo- gy (AlCoB 2018, 2014), Computational Intelligence for Big Social Data Analysis (CI4BigData 2016), Int. Conf.onAlgorithms (ICA 1996). 39

Member of the advisory board for Springer Verlag’s‘‘Undergraduate Topics in Computer Science’’ book series, August 2006 to date. Member of the advisory panel for the Southeast University Journal of Computing Sciences (SEUJCS), Dhaka Bangladesh, July 2021 to date. Coach of StonyBrook’sACM International Collegiate Programming Contest (ICPC) teams, 1999 to 2016. Reached 2006 and 2009 World Finals. External Reviewfor NewJerseyInstitute of Technology BS program in Data Science, January 4, 2021. External reviewer for University of Iowa Computer Science Department, March 2-3, 2020. External reviewer for NewJerseyInstitute of Technology’sMSprogram in Bioinformatics, May 22, 2006. External reviewer for Montclair State University’sScience Informatics program, May 15, 2006. External reviewer for Dowling College’sComputer Science program, December 5, 2005. Referee for IEEE Trans. Computers, SIAM J.Computing, Info. Proc. Letters, Algorithmica, J. ACM, Ann. Mathematics and Artificial Intelligence, The Visual Computer, J. Algorithms, Discrete and Com- putational Geometry, Computersand Graphics, Disc. Applied Math., Mathematics Magazine, Pattern Recognition Letters, Acta Informatica, IEEE Trans. Robotics and Automation, ORSA J.Computing, Operations Research,European J.Operations Research,Computational Geometry: Theory and Appli- cations, Information Sciences, SIAM J.Discrete Math, Journal of the London Mathematical Society, Comm. ACM, ComplexSystems, Computing Surveys, Computational Statistics, J.Computational Biol- ogy, ARS Combinatoria, Discrete Math., J.Computer and Systems Sciences, NatureBiotechnology, NatureHuman Behavior,Networks, Proc. Nat. Acad. of Science,Theoretical Computer Science,Irani- an Journal of Electrical and Computer Engineering,Control and Cybernetics, BMC Genomics, BMC Research Notes J.Combinatorial Chemistry,J.Disc. Algorithms, Constraints, J.Global Optimization, IEEE Trans. Automation Science and Engineering,Journal of Computer Science and Technology (China), J.Graph Algorithms and Applications, Software: Practice and Experience,Discrete Opti- mization, Data and KnowledgeEngineering,PLOS Biology,PLOS Computational Biology,ACM Tr ans. KnowledgeDiscovery from Data, J.Math. Biology,Int. J.Computer Games Technology, Genome Biology,J.Web Semantics, ArsMathematica Contemporanea, Theory of Computing Systems, PLOS One,Random Structures and Algorithms, Social Network Analysis and Mining,National Sci- ence Foundation, Natural Sciences and Engineering ReseachCouncil of Canada, Hong Kong Re- search Grants Council, Council of Physical Sciences of the Netherlands Organization for Scientific Research (NWO), Israel Science Foundation (ISF), Indo-US Science and Technology Forum (IUSSTF), Swedish Research Council (VR), Australian Research Council, Maryland TEDCO, Office of Research National Univerisity of Singapore, University of Cyprus, Finland Academy of Sciences, Fall Joint Computing Conference,CVPR ’88, Visualization ’90, SWAT ’96, Graph Drawing ’96, SO- DA ’97, STOC ’97, Approx ’99, Visualization ’99, SODA’00, ESA ’00, RECOMB ’01, COCOON ’01, RECOMB ’02, STOC ’02, SWAT ’02, ESA ’02, SODA’02, Japanese Conf.Discrete/Comp. Geometry ’02, FOCS ’03, WABI ’03, Graph Drawing ’03, SODA’04, FSTTCS ’04, SODA’05, STOC ’05, ESA ’05, ICALP ’06, SODA’07, SODA’11, COCOON ’08, FSTTCS ’08, IWOCA ’10, EUROCOMB ’11, SODA’13, ESA ’13, Yahoo TechPulse ’15, ESA ’16, ICALP ’17, SODA’18. NSF ReviewPanels for Information Technology Research (ITR), Theory of Computing (TOC), Small Business Innovative Research (SBIR), Operations Research / Service Enterprise Engineering, and In- strumentation and Laboratory Improvement (ILI) Programs. NIH P41 ReviewPanel. Local arrangements chair,Tenth ACM Symposium on Computational Geometry,(with Joe Mitchell), StonyBrook, June 6-8, 1994. Organizer,MSI-StonyBrook Workshops on Computational Geometry,(with E. Arkin and J. Mitchell), StonyBrook, October 25-26, 1991, October 23-24, 1992 October 14-16, 1993, and October 20-21, 1995. 40

University of Illinois Senate, 1986-1987. Search Committee for University Librarian, 1987. Contributer to ETS Computer Science GRE, 1985 and 1987. Reviewer for ACM Computing Reviews: March 1984 p. 120, May 1985 pp. 267-268, and August 1990. President, University of Virginia ACM Chapter,1982-1983.