White Paper Com Bioinformatics Needs

White Paper

Review of Infrastructure and Personnel Recommendations for Biomedical Informatics at the Florida State University, College of Medicine: Looking Forward

August 2017

Summary

During Spring of 2017, at the request of Jeffrey N. Joyce, Senior Associate Dean for Research and Graduate Programs in the College of Medicine (CoM), Cynthia Vied of the Translational Science Laboratory (TSL) established a committee to gather information about the bioinformatics and biostatistics needs of researchers across several areas, including researchers who use the omics (proteomics, genomics, metabolomics) resources in the TSL. The committee consisted of FSU faculty who either have or require informatics expertise, and it was asked to give feedback on the broadly defined informatics needs in CoM, including genomics and proteomics needs, health outcomes research, and the MRI facility. These are addressed in this white paper. Key points and recommendations of the committee are summarized below:

1. The use of research methods involving bioinformatics and biostatistics at CoM is likely to keep growing over the next few years. FSU must develop these capacities to keep pace with the integral role of bioinformatics and advanced biostatistics in research and to compete on the national funding scene. Specific areas of projected growth are described in this white paper.

2. The TSL offers cutting-edge omics platforms to researchers but performs bioinformatics analyses on a limited and informal basis. The growth of research involving omics has resulted in more than 20 CoM faculty requesting bioinformatics support from the TSL in the past three years.

3. The FSU Research Computing Center (RCC) provides high performance computing clusters and data storage, but significant problems exist for big data storage and analysis.

4. The Center for Genomics and Personalized Medicine (CGPM) at FSU offers genomic bioinformatics services but does not have proteomics or metabolomics expertise and has limited staff. In addition, this facility is not currently equipped to deal with HIPAA and other federal data regulatory guidelines for the use and storage of patient data.

5. In addition to omics research, the demand for health outcomes research has increased dramatically in terms of federal health funding priorities and the growth in large multi-center grants at CoM.

6. The opportunity for integrating omics big data, health and health outcomes data, and population and geospatial data will require research leadership that is currently underrepresented at FSU.

7. Health outcomes research will require continued addition of expertise in the areas of computer programming, scientific data standardization and aggregation, data storage and security, database structure, and state-of-the-science analytics.

8. The newly established CoM MRI facility will regularly generate terabytes of data that need long-term storage, and it requires large computing resources and networked computers with access to revision-controlled, MRI-specific software packages.

9. The MRI facility, health outcomes researchers across campus, and the TSL have overlapping needs, so a common pool of expertise and infrastructure could help meet the individual requirements of each group.

One immediate scenario to address these needs is CoM-centric: hire additional staff members for each of the groups to provide specific informatics support. Under this scenario, resources would be used to hire informaticians whose expertise is specific to omics, health outcomes research, and medical imaging to support researchers within the College of Medicine. In addition to personnel, it would be important to provide additional computing and data storage resources.
All of this could be accomplished through an Informatics Center with several established researchers who have a history of NIH/NSF funding and the ability to coordinate a diverse group of personnel, from data management staff to PhD-level researchers. It would be a combined core resource and research-focused center within the College.

On the other hand, that scenario would not address the long-term, larger-scale solution needed at the university level, which would include hiring tenure-track faculty in these areas in CoM as well as in other units. This would address the university-level needs for bioinformatics, computational biology, and biostatistics and would have several advantages, including FSU maintaining a cutting-edge intellectual base of faculty and unique training experiences for students at every level. Some of the computing infrastructure already exists at the RCC and could be improved through this endeavor. Additional benefits include increased competitiveness for federal grants in the STEM areas, more collaboration among FSU researchers across colleges, more program project grant opportunities, greater use of core university facilities, less reliance on subcontracting with other universities for bioinformatics and biostatistical collaborations, and student training opportunities in a rapidly growing area with many future job opportunities. Given the size of the investment for either scenario, it is prudent to establish a consulting group to address a university-wide solution.

Introduction:

The rationale for establishing the committee to review the “big data analysis” needs for CoM was based on an initial survey by Jeffrey Joyce, the Senior Associate Dean for Research and Graduate Programs in CoM, of the biostatistical needs of the different departments and core resources (e.g., TSL) of the college.
The breadth of research interests, the potential for increased external funding, and the lack of current resources within CoM make it difficult to address the college's needs. Moreover, the current relationship with the University of Florida through our role in the NIH (CTSA) and PCORI (DataTrust) funded awards has demonstrated a lack of expertise at CoM (and across FSU) to take advantage of the current resources. To better address the big data analytic needs of the researchers within CoM and the already existing links to faculty outside of the college, a committee was charged with identifying and evaluating the breadth of data analytics within CoM. The committee was headed by Dr. Cynthia Vied and included contributions from 14 members of the FSU community (see page 12).

TSL Bioinformatics: Background / Problems / Projected Growth

The TSL is a core facility that provides proteomic, genomic, and metabolomic services for FSU and external researchers. The TSL houses state-of-the-art mass spectrometers for proteomic and metabolomic techniques, as well as a next generation sequencing machine for genomic analyses. These omics technologies generate large datasets (~35 TB of data per year) that are computationally and statistically demanding to analyze. However, the TSL does not formally offer analysis of the data generated from these services beyond basic processing. Researchers who generate data in the TSL need computer software packages and expertise to analyze and visualize these data for publications and grant applications. Most researchers in the biomedical field do not have this expertise and rely on bioinformaticians or biostatisticians to analyze the data.
Specific to CoM, approximately 20 BMS faculty have requested or received bioinformatics help for omics data analysis in the past 3 years (Arbeitman, Meckes, Pinto, Horabin, Nowakowski, Levenson, Delp, Lee, Wang, Olcese, Bhide, McCarthy, Graham, Stefanovic, Hurt, Pinto, Ren, Kato, Rizkallah, Galasko, Kaplan, Bruck). Some faculty have data analysis expertise within their lab but have requested resources, such as functional analysis software and additional computing capabilities, to manage their ongoing needs. These resources are expensive for individual investigators, but software site licenses could be purchased so that multiple investigators could benefit. Secure remote login capabilities would allow researchers to continue the analysis from the laboratory or office. However, most genomic analysis packages lack a graphical interface and require advanced computer programming skills, which will require the hiring of trained personnel.

The next generation sequencing machine and one of the mass spectrometers within the TSL operate at maximum capacity, with samples waiting in a queue for each machine. A new higher-capacity sequencing machine is expected to be added to the TSL during the 2017-2018 fiscal year, partially funded through startup funds from a new investigator in the Department of Biological Science. This machine can produce up to 50 times more data (as much as 6 TB of data per run) than the current sequencing machine. In addition to the new sequencing machine, the TSL staff will be offering two new research techniques, mass spectrometry imaging and advanced metabolomics capabilities, which will substantially increase the data output and bioinformatics needs within the next year.

How have these bioinformatics needs been addressed in the past?

Faculty members in the TSL have been providing some bioinformatics expertise. Drs. Cynthia Vied and Rakesh Singh have been offering sample preparation and analysis for sequencing and proteomics, respectively, for researchers generating data in the TSL. This system has resulted in more researchers using the TSL than would have done so otherwise. The TSL also offers proteomics software tools on virtual machines accessible to lab users
