The SeqWare Virtual Machine

Overview

This is the landing page for the SeqWare VM prepared by Nimbus Informatics for the SeqWare project.

The goal of this VM is twofold. First, we want you to see what a completely configured SeqWare environment looks like and to try out the various sub-projects without going through a lengthy install process (I'm looking at you, Globus Toolkit!). Second, we want you to have a fully functioning single-node cluster environment that you can use, along with our HelloWorld workflow template, to create and test new Workflow Bundles for SeqWare Pipeline. If you are a Nimbus Informatics customer, you can even inject your finished workflow bundles into our system to run them transparently on EC2! For commercial services for SeqWare, contact us at Nimbus Informatics; for the open source SeqWare project, contact the project via our community page on GitHub.

Installed Tools

This VM includes the following tools pre-installed:

  • SeqWare MetaDB
  • SeqWare Pipeline (and all of its many dependencies)
  • SeqWare Portal
  • SeqWare Web Service
  • SeqWare Query Engine

Documentation

Please start by reading the documentation on our public GitHub site. At a minimum, follow the three tutorials in this order: the User Tutorial, the Developer Tutorial, and the Admin Tutorial.

Code

You can find our source code checked out in:

/home/seqware/gitroot/seqware

You can find more information on the SeqWare developer website.

SeqWare MetaDB

The MetaDB is accessible from the command line. The password is "seqware".
psql -U seqware -W seqware_meta_db
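
For scripted access, the same connection can be opened without the interactive password prompt. The sketch below only assembles the psql invocation from the values given above. It drops -W and passes the password through the PGPASSWORD environment variable, which libpq reads. The helper names and the example query are illustrative assumptions and are not part of SeqWare.

```python
import os
import subprocess

# Values taken from this page; the helpers themselves are hypothetical.
DB_USER = "seqware"
DB_NAME = "seqware_meta_db"
DB_PASSWORD = "seqware"


def build_psql_command(sql):
    """Return (argv, env) for running one SQL statement through psql."""
    argv = ["psql", "-U", DB_USER, "-d", DB_NAME, "-c", sql]
    env = dict(os.environ)
    # libpq reads PGPASSWORD, so psql does not prompt for the password.
    env["PGPASSWORD"] = DB_PASSWORD
    return argv, env


def run_sql(sql):
    """Run the statement on the VM and return psql's standard output."""
    argv, env = build_psql_command(sql)
    result = subprocess.run(argv, env=env, check=True,
                            stdout=subprocess.PIPE, universal_newlines=True)
    return result.stdout
```

On the VM, run_sql("SELECT 1;") should be enough to confirm that the database accepts the connection.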

SeqWare Pipeline

The Pipeline is a developer tool and is therefore focused on the command line. The goal is to give you a toolset for making workflow bundles that process NGS data in a cluster- and environment-agnostic way. You will find a sample workflow bundle in the following directory:

/home/seqware/provisioned-bundles

You can find a great deal of information about workflow development on the SeqWare site; see our Documentation page for details. In particular, we suggest that all Pipeline users follow the three tutorials in this order: the User Tutorial, the Developer Tutorial, and the Admin Tutorial.

SeqWare Pipeline - Oozie

The workflow engine for SeqWare Pipeline is also installed on this VM. This backend is based on Hadoop with Oozie as the scheduling engine, providing a fast and simple workflow execution environment that should be transparent to workflow authors. This backend has a web interface that makes it easy to monitor workflows. See the Oozie console at:

http://hostname:11000/oozie/

Hue provides a nice alternative view of Oozie workflows. You can reach it here:

http://hostname:8888/oozie/

The username and password are both "seqware". Once you log in, click on the Oozie icon at the top of the screen to see the queue and to monitor your running SeqWare workflows.
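
Workflows can also be monitored without a browser. The sketch below assumes that the Oozie release on this VM exposes the standard Oozie Web Services API (version v1) on the same port as the console. It builds the URL that lists recent workflow jobs and extracts each job's id and status. The function names are hypothetical, and "hostname" stands for the VM's address as in the URLs above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def oozie_jobs_url(hostname, length=10, port=11000):
    """Build the Oozie REST URL that lists recent workflow jobs."""
    query = urlencode({"jobtype": "wf", "len": length})
    return "http://{}:{}/oozie/v1/jobs?{}".format(hostname, port, query)


def list_workflow_jobs(hostname):
    """Fetch recent workflow jobs and return (id, status) pairs."""
    with urlopen(oozie_jobs_url(hostname), timeout=10) as response:
        payload = json.loads(response.read().decode("utf-8"))
    # The jobs listing is expected under the "workflows" key.
    return [(job["id"], job["status"]) for job in payload.get("workflows", [])]
```

Calling list_workflow_jobs("hostname") from a machine that can reach the VM returns one pair per job in the queue; only the URL construction can be checked offline.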

You can find a great deal of information about workflow development on the SeqWare site; see our Documentation page for details.

The following Hadoop-related URLs may also prove useful:

SeqWare Portal

The Portal is a lightweight LIMS focused on showing you the status of workflows and the data they produce. It is located at http://hostname:8080/SeqWarePortal. Please log in with username "admin@admin.com" and password "admin". If you run workflows using SeqWare Pipeline, you will see their results in the Portal.

SeqWare Web Service

The Web Service is located at http://hostname:8080/SeqWareWebService. Please log in with username "admin@admin.com" and password "admin". This is a RESTful web service, so functionality through the web browser is generally limited. It is designed for programmatic access and is used by our command line tools in the SeqWare Pipeline project. For details about the API, please see our API documentation.
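
As a starting point for programmatic access, the sketch below prepares an authenticated request using the credentials above. It assumes the Web Service accepts HTTP Basic authentication, which this page does not state explicitly. The function name and the resource argument are hypothetical; real resource paths should be taken from the API documentation.

```python
import base64
from urllib.request import Request

WEB_SERVICE_PATH = "/SeqWareWebService"


def build_request(hostname, resource="", user="admin@admin.com", password="admin"):
    """Return a urllib Request for the Web Service with a Basic auth header.

    `resource` is a placeholder; consult the API documentation for real paths.
    """
    url = "http://{}:8080{}{}".format(hostname, WEB_SERVICE_PATH, resource)
    # Basic auth (an assumption here) sends base64("user:password").
    token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
    request = Request(url)
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request
```

Passing the result to urllib.request.urlopen sends the request; nothing is transmitted until then, so the prepared request can be inspected first.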

SeqWare Query Engine

The Query Engine is an HBase database and associated tools that let you store and query genomic variants (SNVs and small indels) along with their annotations (non-synonymous, etc.). It is very much a work in progress. If you want to take a look at the code, see:

/home/seqware/gitroot/seqware/seqware-queryengine

This VM includes Cloudera's CDH4 pre-installed, so Hadoop and HBase should be ready to go. Take a look at the SeqWare Query Engine documentation page for examples of loading variants into this database.

About Nimbus Informatics

This VM was prepared by Nimbus Informatics. You can find more information about Nimbus Informatics, our services, our commitment to open science, and our use of cutting-edge cloud technologies on our company website at http://nimbusinformatics.com. The core service we provide is custom workflow creation and managed cloud-based hosting of those workflows.

Contact Us

Please feel free to contact us at help@nimbusinformatics.com if you have any questions about our cloud computing services or commercial support for SeqWare.

For questions about the open source SeqWare project, contact us via our community page on GitHub. For questions about the documentation, please leave comments on the pages; for issues with the tools, log bugs in our public tracker. You can also contact the developers using our mailing lists.