Your Sequence Submission Pack

The purpose of this document is to provide detailed information on the key EGA submission stages for your sequence data. If your submission also includes array-based data covered under the same study (publication), please generate your study accession first, using the instructions provided below. The study accession obtained should then be used for your array-based submission.

Before progressing, please ensure that you read and follow our guidelines on preparing your sequence data files. Additional submission help and support can be obtained by emailing EGA-Helpdesk.

Key stages of sequence submissions in detail

Encrypt - Calculate - Upload - Send Key - Document

Encrypt

Encrypt all your documents and files using GnuPG. Contact EGA Helpdesk to obtain the GnuPG public key over email.

Before uploading your data files to your submission account, all data files must be encrypted using GnuPG.

Quick guide to using GnuPG for encryption

i) Follow the installation instructions found here.

ii) If creating your own key, use the command:

    gpg --output <filename.gpg> -c <filename>

Follow the onscreen prompts and choose the default options, which will create an encrypted copy for each file. If using the EGA public key, import the key by using the command:

    gpg --import EGA_Public_Key

iii) Now encrypt your files using the command:

    gpg -e [filename1] [filename2] [etc]

If using your own key, enter the UID generated when you created the key in step ii. For the EGA public key, enter your UID as 'EGA_Public_Key'. You should now have an encrypted copy of each file, with the suffix .gpg.

Further information on using GnuPG can be found on their documentation pages here. You can also use the EGA uploader tool to encrypt and generate md5sum values for your files locally (without upload). See the submission tools for more information.

Calculate

Calculate md5 checksums for files before and after encryption (i.e.
each file should have two md5 values).

The md5sum program is installed by default on most Unix, Linux and Unix-like systems. The Windows md5sum program is available here. To generate md5sum values for any number of files use the command:

    md5sum <file1> <file2> <etc> > myvalues.md5

This will create md5sum values for the files listed and save these values into a file called 'myvalues.md5'. Please upload your md5sum values to your data upload account.

Further information on md5sum can be found here. You can also use the EGA uploader tool to encrypt and generate md5sum values for your files locally (without upload). See the submission tools for more information.

Upload

Upload all your data files into your data upload account. Methods available for uploading data are detailed below.

Using Aspera: Downloading the Aspera ascp command line program

Aspera is a commercial file transfer protocol that provides faster transfer speeds than ftp over long distances. For short distance file transfers we continue to recommend the use of ftp. The Aspera ascp command line client (Aspera Connect) can be downloaded here. Please select the correct operating system. The ascp command line client is distributed as part of the Aspera Connect high-performance transfer browser plug-in.

Using Aspera: Using the Aspera ascp command line program

Please note: The ascp command line should be run from within the Aspera directory containing ascp.exe. Your command should look similar to this:

    ascp -QT -l300M -L- <files to upload> <ega-box-N>@fasp.ega.ebi.ac.uk:/.

The '-l300M' option sets the upload speed limit to 300 Mbit/s (roughly 37 MB/s). You may wish to lower this value to increase the reliability of the transfer. The '-L-' option prints logs while transferring. <files to upload> can be a file mask (e.g. '/homes/submitter/*.srf') or a list of files. <ega-box-N> is your password-protected Aspera login.
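As a rough sketch of how the Calculate and Upload stages fit together, the script below records two md5 values per file (plain and encrypted copy) and then prints, without running it, an ascp command in the form shown above. The helper names `write_checksums` and `print_ascp_command`, the demo files and the `ega-box-N` login are placeholders introduced for illustration; the `.gpg` file in the demo is a stand-in, not real GnuPG output.

```shell
#!/bin/sh
# Sketch only: helper names, demo files and the "ega-box-N" login are
# placeholders, not part of any EGA tool.

# Write md5 values for each file and for its encrypted copy (<file>.gpg),
# so that every data file ends up with two md5 values.
write_checksums() {
    out=$1; shift
    : > "$out"
    for f in "$@"; do
        md5sum "$f" "$f.gpg" >> "$out"
    done
}

# Print (do not run) an ascp command for the given login and files.
# -k2 is the transfer-restart switch mentioned in this guide.
print_ascp_command() {
    login=$1; shift
    echo "ascp -QT -l300M -L- -k2 $* ${login}@fasp.ega.ebi.ac.uk:/."
}

# Demo with placeholder files; the .gpg file is NOT real GnuPG output.
demo_dir=$(mktemp -d)
printf 'reads\n' > "$demo_dir/sample.srf"
printf 'stand-in for encrypted bytes\n' > "$demo_dir/sample.srf.gpg"

write_checksums "$demo_dir/myvalues.md5" "$demo_dir/sample.srf"
md5sum -c "$demo_dir/myvalues.md5"   # re-check the recorded values
print_ascp_command "ega-box-N" "$demo_dir/sample.srf.gpg"
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without Aspera installed; in a real submission the files would first be encrypted with GnuPG as described in the Encrypt stage.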
Add the '-k2' switch to enable transfer restarts.

Using the default ftp command line client in Windows

1- Start the command line interpreter: press Win-R, type cmd, hit enter
2- Enter 'ftp ftp-private.ebi.ac.uk'
3- Enter your login
4- Enter your password
5- To see a list of available ftp commands type 'help'.
6- Type 'ls' command to check the content of your submission account.
7- Type 'prompt' to switch off confirmation for each file uploaded.
8- Use 'mput' command to upload files: 'mput *.srf'
9- Use 'bye' command to exit the ftp client.
10- Use 'exit' command to exit the command line interpreter.

Using the default ftp command line client in Linux/Unix

1- Open a terminal and type 'ftp ftp-private.ebi.ac.uk'
2- Enter your login
3- Enter your password
4- To see a list of available ftp commands type 'help'.
5- Type 'ls' command to check the content of your drop box.
6- Type 'prompt' to switch off confirmation for each file uploaded.
7- Use 'mput' command to upload files: 'mput *.srf'
8- Use 'bye' command to exit the ftp client.

Send Key

Pass your encryption key to the EGA by post or phone (not required if the GnuPG public key is used).

Please do not pass your encryption key over email. You may use postal/courier services, deliver in person or pass the key over the phone.

Our contact details:

Mr Jeff Almeida-King
EGA User Support Officer
EMBL-EBI
Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
Tel: +44 (0) 1223 494559

Document

Provide details of your Study, Samples, Experiments, Runs/Analysis, Policy and Dataset/s.

We require the following documentation for your submission:

1 – Policy documentation to enable submission to EGA
2 – Metadata associated with your study

1- Policy documentation

Please be advised that the EGA can only archive and distribute your submitted data upon receipt and validation of all policy documentation.
These documents provide details of your Data Access Committee (DAC), which will be responsible for granting access to the data, and provide authorisation for your data upload. Further information on DACs can be found here. Please see the links below for examples of the required policy documentation that should accompany your submission and may be emailed directly to EGA Helpdesk.

• Data Access Agreement
• Data access application form
• Policy statements

2- Metadata associated with your study

**Metadata submitted as XMLs or through the Webin tool will be made publicly available to view on the EGA website and other EBI resource/partner websites**

Your metadata, which will include details of your samples, experiments, runs/analysis, Data Access Committee (DAC), policy and dataset/s, can be provided by two alternative means:

i) Online using the EGA Webin tool
ii) Creating and submitting XMLs

i) Using the EGA Webin tool

This online tool enables you to create new submissions and edit existing ones. Go to the EGA Webin page and log in using your submission account name and password.

For the submission of sequencing reads that have been uploaded to your submission upload account:

• Go to the ‘New Submission’ tab
• Choose ‘I wish to do a complete submission’ and follow the online prompts, which will guide you through adding information for your study, samples, experiments and runs.
• Once completed, please register your Data Access Committee (DAC), Data access policy and dataset to conclude your metadata submission.

To generate a study accession number (EGASXXXXXXXXXXX), for use in your publication, before your reads have been uploaded:

• Go to the ‘New Submission’ tab
• Choose ‘I wish to register study’ and follow the online prompts
• Your samples, Data Access Committee (DAC) and Data access policy may also be registered before your reads have been uploaded.
To use the study accession number in a publication, we suggest the following format:

"Genotype data has been deposited at the European Genome-phenome Archive (EGA, http://www.ebi.ac.uk/ega/), which is hosted by the EBI, under accession number EGASXXXXXXXXX."

Further information regarding the use of Webin can be found here.

What happens after the key submission stages have been completed?

Upon the completion of a dataset, your website is prepared, which will point to your study, dataset and Data Access Committee. Once your draft website is completed, a member of the EGA will be in touch before your website goes live to ensure:

• Your study is represented accurately
• Access to EGA user management tools is provided to the Data Access Committee named contacts

Further information regarding the role of the Data Access Committee can be found here.

Finally, your data is archived within our databases and prepared for encrypted distribution upon the request of permitted EGA account holders. We strongly advise you NOT to delete your data until we confirm that your data has been successfully archived.

ii) Creating and submitting XMLs

All metadata required by the EGA may be collected using our EGA XMLs. Submitters are required to prepare, validate and submit the XMLs.

Working with XML

We recommend manipulating EGA metadata using an XML editor, preferably one with the ability to validate against XML schemas. A good article on choosing an XML editor can be found here. Alternatively, XML can be edited in standard text editors and then checked using an XML validator, e.g. xmllint, a free Unix-based XML validator.

General concepts: Aliases and center names

Every EGA object must be uniquely identified within the submission account using the alias attribute. The aliases can be used in submissions to make references between EGA objects.
Please find more information about the use of aliases and center names below:

alias attribute: every object should have a name that is unique within your submission account. Once submitted successfully, every alias will be assigned an accession.

refname attribute: when an object references another by its alias, the alias goes into the refname attribute.
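The relationship between alias and refname can be illustrated with a small sketch. The script below writes two XML fragments, one declaring an alias and one referencing it through refname, and then checks that every refname value matches a declared alias. The aliases, the center name `MY_CENTER`, the helper `extract_attr` and the element names (modelled loosely on SRA-style XML) are assumptions made for illustration; the fragments are deliberately incomplete and are not expected to validate against the EGA schemas.

```shell
#!/bin/sh
# Sketch only: aliases, center name and element names are illustrative
# assumptions, not taken from the EGA XML schemas.
work_dir=$(mktemp -d)

# An object that declares an alias.
cat > "$work_dir/sample.xml" <<'EOF'
<SAMPLE_SET>
  <SAMPLE alias="my_sample_001" center_name="MY_CENTER">
    <TITLE>Illustrative sample</TITLE>
  </SAMPLE>
</SAMPLE_SET>
EOF

# An object that points at the sample by putting its alias into refname.
cat > "$work_dir/experiment.xml" <<'EOF'
<EXPERIMENT_SET>
  <EXPERIMENT alias="my_experiment_001" center_name="MY_CENTER">
    <SAMPLE_DESCRIPTOR refname="my_sample_001"/>
  </EXPERIMENT>
</EXPERIMENT_SET>
EOF

# Print the values of one attribute across the given files, sorted and unique.
extract_attr() {
    attr=$1; shift
    grep -ho " $attr=\"[^\"]*\"" "$@" | sed "s/^ $attr=\"//; s/\"\$//" | sort -u
}
extract_attr alias "$work_dir"/*.xml > "$work_dir/aliases.txt"
extract_attr refname "$work_dir"/*.xml > "$work_dir/refnames.txt"

# Lines present only in refnames.txt are references with no matching alias.
unresolved=$(comm -23 "$work_dir/refnames.txt" "$work_dir/aliases.txt")
if [ -z "$unresolved" ]; then
    echo "all refname values resolve to an alias"
else
    echo "unresolved refname values: $unresolved"
fi

# Optional well-formedness check, only if xmllint is installed.
if command -v xmllint >/dev/null 2>&1; then
    xmllint --noout "$work_dir"/*.xml
fi
```

A text-based check like this only catches dangling references within the local files; it does not replace validation against the XML schemas with a proper validator.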
