NIAGADS

6.4 Launch Amazon EC2 Spot Instances via starcluster

Finally, launch an Amazon EC2 Spot Instance via StarCluster to process a WGS / WES sample with the workflow selected in Section 6.2. The following steps are required:

1) Copy the SGE plugins into the StarCluster folder on the tracking database instance:

 cp -r /path/to/vcpa-pipeline/plugins ~/.starcluster/

2) Go into the VCPA pipeline bin folder:

 cd /path/to/vcpa-pipeline/bin

3) Launch the Amazon EC2 Spot Instance to process one sample with the following command:

 starcluster start -c smallcluster --bid=${price} -U ${workflow_path} -P ${HOST_PREFIX}-${PROJECT_ID}-${SAMPLE_NAME}

price: bid price for the Spot Instance

workflow_path: /path/to/vcpa-pipeline/bin/${PROJECT_WORKFLOW}_aws.sh

HOST_PREFIX: host name for the tracking database

PROJECT_ID: the project ID returned in Section 3.2

SAMPLE_NAME: sample name (this must match the sample name of the input file in the S3 bucket)
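
As a minimal sketch of how the variables above fit together, the script below sets each one and prints the resulting launch command so it can be reviewed before it is run. Every value shown (the bid price, the workflow name "wgs", the host prefix "trackdb", the project ID and the sample name) is a hypothetical placeholder and not taken from the VCPA documentation; substitute the values from your own setup.

```shell
#!/bin/sh
# Hypothetical placeholder values; replace each with your own.
price="0.50"               # assumed example bid price
PROJECT_WORKFLOW="wgs"     # assumed name of the workflow chosen in Section 6.2
workflow_path="/path/to/vcpa-pipeline/bin/${PROJECT_WORKFLOW}_aws.sh"
HOST_PREFIX="trackdb"      # assumed host name of the tracking database
PROJECT_ID="1"             # assumed project ID from Section 3.2
SAMPLE_NAME="SAMPLE001"    # must match the sample name of the input file in S3

# The cluster tag joins host prefix, project ID and sample name with hyphens.
cluster_tag="${HOST_PREFIX}-${PROJECT_ID}-${SAMPLE_NAME}"

# Assemble the command from step 3 as a string; nothing is launched here.
launch_cmd="starcluster start -c smallcluster --bid=${price} -U ${workflow_path} -P ${cluster_tag}"

# Print the command for review before running it by hand.
echo "${launch_cmd}"
```

Printing the command first makes it easy to catch an empty or misspelled variable, which would otherwise produce a malformed cluster tag or a missing workflow path.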


Last updated 6 years ago