
ChimericSeq Quick User Guide for Windows

Fwu-Shan Shieh

Version 4.1

System and Data Requirements

System memory: The minimum system memory requirement is 8 GB.

Hard disk free space:

a. 10 GB for the Human Reference genome and the ChimericSeq application.
b. 2 times the test data size, to hold the original test data and the alignment data.
c. If you need to split the test data into smaller files, 3 times the test data size, to hold the original test data, the split files, and the alignment data.
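The disk space rule above is simple arithmetic, sketched below in Python. This is only an illustration and not part of ChimericSeq. It assumes that "10G" means 10 x 1024^3 bytes and that the test data allowance (2 times, or 3 times when splitting) is added on top of the fixed 10G; the function names are hypothetical.

```python
import shutil

GIB = 1024 ** 3  # assumption: "G bytes" is read as 2**30 bytes


def required_free_bytes(test_data_bytes, will_split=False):
    """Estimate the free disk space the guide asks for.

    10G for the reference genome and application, plus 2x the test data
    size (3x if the data will be split into smaller files).
    """
    if test_data_bytes < 0:
        raise ValueError("test_data_bytes must not be negative")
    multiplier = 3 if will_split else 2
    return 10 * GIB + multiplier * test_data_bytes


def has_enough_space(drive, test_data_bytes, will_split=False):
    """Compare the estimate with the free space reported for a drive."""
    free_bytes = shutil.disk_usage(drive).free
    return free_bytes >= required_free_bytes(test_data_bytes, will_split)
```

For example, `has_enough_space("c:\\", 5 * GIB, will_split=True)` asks whether drive C has room for 5G of test data that will be split before processing.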

Test data filename restriction: During data analysis, the system uses part of the test data filename to create a new folder to store the test results. The new folder name is the test data file basename with its last character removed. To prevent test results from being overwritten, make sure that:

1. The test data file basename is longer than 1 character.
2. The test data file basename with the last character removed is unique.
3. For paired data files, the forward read filename ends in 1.fastq and the reverse read filename ends in 2.fastq (such as Test_R1.fastq for the forward read file and Test_R2.fastq for the reverse read file).

For example:

1. If the test data file is called Test_R1.fastq, then the basename is “Test_R1” and the extension is “fastq”. So the new folder created to store the test results is called “Test_R” (“Test_R1” with the last character removed).
2. The test data files “Patient_A.fastq” and “Patient_B.fastq” will use the same folder, called “Patient_”, so the results of one would overwrite the results of the other.
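The naming rule above can be checked before a run with a short Python sketch. It is not ChimericSeq code: the function names are hypothetical, and it assumes the basename is everything before the final extension. Two files that share a folder name are accepted only when they look like a forward/reverse pair (basenames ending in 1 and 2).

```python
from collections import defaultdict
from pathlib import PureWindowsPath


def result_folder_name(filename):
    """Return the basename with its last character removed."""
    stem = PureWindowsPath(filename).stem
    if len(stem) < 2:
        raise ValueError(f"basename of {filename!r} must be longer than 1 character")
    return stem[:-1]


def find_collisions(filenames):
    """Group files by result folder and report groups that are not a 1/2 pair."""
    groups = defaultdict(list)
    for name in filenames:
        groups[result_folder_name(name)].append(name)

    collisions = {}
    for folder, names in groups.items():
        last_chars = sorted(PureWindowsPath(n).stem[-1] for n in names)
        is_read_pair = last_chars == ["1", "2"]
        if len(names) > 1 and not is_read_pair:
            collisions[folder] = names
    return collisions
```

Calling `find_collisions(["Patient_A.fastq", "Patient_B.fastq"])` reports both files under the shared folder name `Patient_`, while a `Test_R1.fastq` / `Test_R2.fastq` pair is not reported.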

Reads limit: For better performance, the maximum number of reads is 1 million for a system with 8 GB of memory, 2 million for a system with 16 GB of memory, and 4 million for a system with 32 GB of memory. For larger files, the user may use the "Split Large File" option in the File menu to split paired files into smaller files and then process them.
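To see whether a file is within the reads limit, the reads can be counted outside the application. The sketch below is an illustration, not part of ChimericSeq. It assumes the standard FASTQ layout of exactly four lines per read, and it assumes that memory sizes between or above the listed tiers fall back to the nearest lower tier, which the guide does not state. All names are hypothetical.

```python
# (memory in GB, maximum reads) tiers taken from the guide, largest first
READ_LIMITS = [(32, 4_000_000), (16, 2_000_000), (8, 1_000_000)]


def max_reads(memory_gb):
    """Return the recommended maximum number of reads for a memory size."""
    for tier_gb, limit in READ_LIMITS:
        if memory_gb >= tier_gb:
            return limit
    raise ValueError("the guide requires at least 8 GB of system memory")


def count_fastq_reads(path):
    """Count reads in an uncompressed FASTQ file (4 lines per read)."""
    with open(path, "rb") as handle:
        line_count = sum(1 for _ in handle)
    return line_count // 4


def fits_in_memory(path, memory_gb):
    """True if the file has no more reads than the recommended maximum."""
    return count_fastq_reads(path) <= max_reads(memory_gb)
```

If `fits_in_memory` returns False, the file is a candidate for the "Split Large File" option.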

Download ChimericSeq

Download the ChimericSeq application from JBS: https://jbs-science.com/software/

Run the downloaded ChimericSeq_zip.exe file to install the ChimericSeq application.

Specify a folder to store the ChimericSeq application. A new folder called “ChimericSeq” will be created under the specified folder. In the example below, a new folder “c:\ChimericSeq” is used to store the application.

Download the zipped Viral reference file.

Unzip the Viral reference file to c:\ChimericSeq\Viral_Reference folder. The c:\ChimericSeq\Viral_Reference folder was created during ChimericSeq installation.

Use the same steps to download the zipped Human reference file and unzip it to the C:\ChimericSeq\Host_Reference folder.

Download the zipped Human GTF file and unzip it to the C:\ChimericSeq\Host_Reference folder.

Run the application ChimericSeq.exe (located in the newly created c:\ChimericSeq folder).

Two windows will be displayed. Window A is the Python shell window and window B is the ChimericSeq application.

Host & Viral index files

The user may either build the index files or download pre-built index files.

A. Build Index files -- Use “Options/Set Locations” to set the Viral Reference, Host Reference, & Host GTF files and then “Build”.

Set file locations.

Click “Build” to build viral index files.

Click “” to confirm the build.

The progress of the build will be displayed in the application log window.

After the build completes, 6 viral index files (viralRef.1.bt2, viralRef.2.bt2, viralRef.3.bt2, viralRef.4.bt2, viralRef.rev.1.bt2, & viralRef.rev.2.bt2) will be created and placed in the Viral Index Directory indicated above.

Use the same steps to build the Human index files.

NOTES:

1) If the index files fail to build, you may need to properly set the paths for perl and Python on your system. HERE are directions for setting your paths in Windows for this purpose.
2) It takes more than 2 hours to build the Host index files. On a slow system, it could take even longer.
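As a quick check related to note 1, the sketch below uses Python's standard `shutil.which` to see whether executables can be found on the current PATH. The names `perl` and `python` are assumptions; the guide does not say which executable names ChimericSeq looks for, so treat the result as a hint only.

```python
import shutil


def locate_tools(names=("perl", "python")):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, location in locate_tools().items():
        print(f"{name}: {location or 'NOT FOUND on PATH'}")
```

A value of None for either tool suggests that its install folder still needs to be added to the Windows PATH.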

After the build completes, 6 host index files (humanRef.1.bt2, humanRef.2.bt2, humanRef.3.bt2, humanRef.4.bt2, humanRef.rev.1.bt2, & humanRef.rev.2.bt2) will be created.
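Whether a build produced all six files can be confirmed with a short script. The sketch below is an illustration under stated assumptions: the prefixes `viralRef` and `humanRef` and the six suffixes come from the file lists in this guide, while the function names and any directory passed in are hypothetical.

```python
from pathlib import Path

# Suffixes of the six index files listed in the guide
INDEX_SUFFIXES = ("1.bt2", "2.bt2", "3.bt2", "4.bt2", "rev.1.bt2", "rev.2.bt2")


def expected_index_files(prefix):
    """Return the six expected index filenames for a prefix such as 'viralRef'."""
    return [f"{prefix}.{suffix}" for suffix in INDEX_SUFFIXES]


def missing_index_files(index_dir, prefix):
    """List the expected index files that are not present in index_dir."""
    index_dir = Path(index_dir)
    return [
        name
        for name in expected_index_files(prefix)
        if not (index_dir / name).is_file()
    ]
```

An empty list from `missing_index_files(viral_index_dir, "viralRef")` or `missing_index_files(host_index_dir, "humanRef")` means all six files are in place, where the two directory arguments stand for the index directories set in “Options/Set Locations”.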

B. Download Index Files

The user may also download and unzip the zipped index files from the JBS web page https://jbs-science.com/software/.

1. Download HumanRef.zip and unzip the files into the “Host Index Directory” specified in the “Options/Set Locations” window.
2. Download ViralRef.zip and unzip the files into the “Viral Index Directory” specified in the “Options/Set Locations” window.

Test Installation

To test the installation, download the sample reads and unzip them (Test_R1.fastq & Test_R2.fastq) to the c:\ChimericSeq folder.

Click the “…” button to open the file dialog window and select the paired test files Test_R1.fastq and Test_R2.fastq.

Click “Start Run” to run the test. You may need to click the “Yes” button 2 times to confirm the process.

Result:

Note:

The test result will not be saved until the user selects the “File/Save” option.

Handle Large Test Data Files

For better performance, the maximum number of reads is 1 million for a system with 8 GB of memory, 2 million for a system with 16 GB of memory, and 4 million for a system with 32 GB of memory. For larger files, the user may use the "Split Large File" option in the File menu to split paired files into smaller files and then process them. The “Split Large File” process splits a large file into smaller files of 1 million reads each.
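The number of files to expect from a split follows from the 1 million reads per file figure. The sketch below is only an estimate and not ChimericSeq code: it assumes the final file holds whatever reads remain, which the guide does not state, and the function name is hypothetical.

```python
def split_sizes(total_reads, reads_per_file=1_000_000):
    """Return the expected number of reads in each split file, in order."""
    if total_reads < 0 or reads_per_file <= 0:
        raise ValueError("total_reads must be >= 0 and reads_per_file > 0")
    full_files, remainder = divmod(total_reads, reads_per_file)
    sizes = [reads_per_file] * full_files
    if remainder:
        sizes.append(remainder)  # assumption: leftover reads go in a last, smaller file
    return sizes
```

Under these assumptions, a paired data set with 2.5 million reads per file would give three split files per read direction, the last holding 500,000 reads.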

Use the “Split Large File” option to open the split file window.

In the split file window, click the “ . . . “ button to open the file dialog window and select the paired files.

After the desired files are selected, click the “Split Files” button to split them.

When the split process is complete, a new folder is created to hold the new files.

To process the split files: (1) check the “Select Directory” option, (2) click the “ . . . “ button to open the file dialog and select the desired folder, and (3) click “Start Run” to process.

Unlike the single-file process, the results from the split-file process are saved automatically.