Documentation

This page briefly describes the programs, scripts, software, web servers, standalone tools and configuration files included in OSDDLinux.

OSDDLinux, in its current scope and flexibility, is intended to serve at least five types of users.

  • 1. Occasional users who need bioinformatics and cheminformatics tools from time to time but wish to keep the tool package separate from their system. For such cases, OSDDLinux has been made bootable and operable from an external memory device such as a CD/DVD or USB drive.
  • 2. Regular users for whom integrating the tool package with their local system would be a big advantage for analyzing bulk data offline.
  • 3. Windows users who want to use bioinformatics and cheminformatics tools on Linux alongside their Windows system. OSDDLinux can be run as a virtual desktop using VMware, VirtualBox, etc. while work continues on Windows.
  • 4. Regular Ubuntu users already working in the Unix environment who wish to install the OSDDLinux bioinformatics and cheminformatics tools in their own Ubuntu version. The informatics tools in OSDDLinux are available for download from the OSDDLinux portal.
  • 5. Developers who would like to enhance or customize OSDDLinux for their own needs or for new applications. They are supported through access to the source code.
  • Overview
  • System Configuration and configuration files
  • Getting Started with OSDD-Linux
  • System requirement
  • Accessing and Connecting to OSDD-Linux
  • Standalone
  • Web server
  • Galaxy
  • Software Applications

    Overview

    OSDD Forum is an initiative with a vision to provide affordable healthcare to the developing world. OSDD initiated OSDD-Linux to provide computational resources for researchers in the field of computer-aided drug design. One of the major challenges for bioinformaticians is to understand and solve biologists' problems. At the same time, the solution should be user-friendly enough that a person with only a little knowledge of computers can use it. Though we have tried our best to help biologists, our programs and services are still far from perfect. Our web servers perform well for single-sequence queries or for a small number of sequences, but they cannot perform predictions for a whole genome or proteome (because we cannot provide the required CPU time). Moreover, because of limited bandwidth and for security reasons, users often wish to run these servers on their local machines. To meet these demands, our group is releasing this software package, which is a collection and integration of computer programs developed in our group over the years.

    System Configuration and configuration files
    This section briefly describes the directories and subdirectories, in order of hierarchy, within the gpsr directory that serves as the home directory for OSDDLinux. For each software tool developed, the GPSR distribution package contains three variants: the web server, the standalone and the Galaxy-compatible version. Accordingly, the gpsr directory contains the major subdirectories listed below.
    S. No. Directory: Brief description

    1. /gpsr/webserver/: Hosts all the files used to build the functional web interface of the software. It contains two subdirectories: /gpsr/webserver/cgibin/, which holds the executable programs that run the algorithm of the software and produce the output, and /gpsr/webserver/cgidocs/, which holds the programs that pass the user input to the executables in cgibin and display their results.

    2. /gpsr/standalone/: Holds the programs that can be run on bulk data from the command line terminal.

    3. /gpsr/galaxy/: Here the gpsr tools have been integrated into the Galaxy package so that the user can run the gpsr programs on the local machine with a few clicks in the Galaxy interface.

    4. /gpsr/base/: The location where the basic gpsr package is actually installed. It contains the following major subdirectories:
       • /gpsr/base/bin/: All the basic command-line executable programs of gpsr are placed here.
       • /gpsr/base/includes/: The file base_env in this subdirectory defines the environment variables for the software used by gpsr programs, such as Perl, SVMlight, MEME, MAST, PSIPRED and HMMER. The executables for these are kept in /gpsr/local/bin.
       • /gpsr/base/src/: Stores the source code of gpsr, that is, the gpsr programs themselves. The program install.pl must be executed to compile and install the gpsr package.

    5. /gpsr/bin/: Houses Unix system commands such as ls, grep, cut and wc. This is important because these system commands are used in the source code of gpsr.

    6. /gpsr/data/: Some basic bioinformatics tools used by the gpsr executables require their own databases. The best example is BLAST, which takes an input sequence and aligns it against a database of sequences, such as the non-redundant database hosted by NCBI, Swiss-Prot, PDB or any other database. All these database flat files are placed under the subdirectory /gpsr/data/blastdata/. Similarly, a user who needs a database for another tool, for example HHsuite, may add it by creating a new subdirectory such as /gpsr/data/hhsuitedata/.

    7. /gpsr/examples/: Explains the usage of the standalone versions of the software. For each tool, an example input file and an example output file are provided.

    8. /gpsr/local/: Includes interpreters and related tools for high-level programming languages, such as Perl, Python, C, PHP and MySQL, as well as machine-learning software such as SVMlight and WEKA, all kept in /gpsr/local/bin/. The Apache HTTP web server is kept in a separate subdirectory, /gpsr/local/apache.

    9. /gpsr/software/: Holds the bioinformatics and cheminformatics application software regularly used in research practice, prominent examples being BABEL, PSIPRED and IPKnot.

    10. /gpsr/source/

    11. /gpsr/temp/: The standalone programs in the GPSR package create temporary files while processing data and producing output, and delete them when finished. The temp subdirectory provides the space for creating, using and then deleting a temporary subdirectory for each session of each tool hosted by the GPSR package.
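
    The directory layout described above can be checked with a short script. The following is a minimal sketch in Python, assuming the package root is mounted at /gpsr as the paths in the table suggest; the function name missing_subdirs and the list EXPECTED_SUBDIRS are illustrative and are not part of the GPSR package.

```python
from pathlib import Path

# Subdirectories named in the table above, relative to the gpsr root.
# The list is copied from this page, not read from the GPSR package itself.
EXPECTED_SUBDIRS = [
    "webserver/cgibin",
    "webserver/cgidocs",
    "standalone",
    "galaxy",
    "base/bin",
    "base/includes",
    "base/src",
    "bin",
    "data/blastdata",
    "examples",
    "local/bin",
    "local/apache",
    "software",
    "source",
    "temp",
]


def missing_subdirs(root):
    """Return the expected subdirectories that do not exist under root."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    # "/gpsr" is an assumption; pass a different root if the package lives elsewhere.
    for sub in missing_subdirs("/gpsr"):
        print("missing:", sub)
```

    An empty result means every directory listed on this page is present. Optional database directories such as /gpsr/data/hhsuitedata/ are left out of the list on purpose because they are created by the user.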

    Getting Started with OSDD-Linux

    OSDDLinux has the basic features of a Linux operating system. It can be used after installation on the local system or booted from a peripheral memory device such as a CD/DVD or USB memory stick. The user can download the ISO image file from the OSDDLinux download page and install or mount OSDDLinux. The same image also allows the user to run OSDDLinux on a virtual machine such as VMware or VirtualBox. Being mountable makes OSDDLinux convenient to use, portable and, to a large extent, independent of the local machine.

    System requirement

    Depending on the requirement, the user may run OSDDLinux from a portable storage device such as a CD/DVD or USB drive, install it on a local machine, or launch it on a virtual machine to work alongside another operating system, for example Windows. For the first two cases the system requirements are the same as those stated on the Ubuntu portal. Virtual operation additionally requires virtualization software, such as the open-source VirtualBox or a commercial application such as VMware. The step-by-step VirtualBox installation procedure is explained below.

    Important links for OSDD-Linux


    Standalone


    The standalone programs are the most useful variants of the bioinformatics and cheminformatics tools for regular users who need to analyze bulk data on a routine basis. The programs run on the command line, which makes it convenient to provide input, and they write output in easily readable file formats.
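
    Bulk analysis with a command-line program usually means running the same command over many input files, each in its own temporary working directory that is removed afterwards. The following is a minimal sketch of that pattern in Python. The helper names run_session and run_batch are hypothetical, and the convention of appending the input and output paths to the command is an assumption, not the calling convention of any particular GPSR program.

```python
import subprocess
import tempfile
from pathlib import Path


def run_session(command, input_file, output_file, temp_root=None):
    """Run one command inside its own temporary session directory.

    `command` is a list of arguments. The input and output paths are
    appended to it; that convention is an assumption made for this sketch.
    """
    # The session directory is deleted automatically when the block exits.
    with tempfile.TemporaryDirectory(dir=temp_root) as session_dir:
        subprocess.run(
            command + [str(input_file), str(output_file)],
            cwd=session_dir,  # scratch files written by the tool land here
            check=True,       # raise an error if the tool exits with failure
        )


def run_batch(command, input_files, output_dir, temp_root=None):
    """Run `command` once per input file and return the output paths."""
    output_dir = Path(output_dir).resolve()
    output_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for input_file in input_files:
        # Absolute paths are needed because the tool runs in another directory.
        input_file = Path(input_file).resolve()
        output_file = output_dir / (input_file.stem + ".out")
        run_session(command, input_file, output_file, temp_root)
        outputs.append(output_file)
    return outputs
```

    Passing temp_root="/gpsr/temp" would place the session directories in the temp subdirectory described earlier, assuming that path exists and is writable. The command itself should be given with an absolute path, or be found on PATH, because each run starts in its session directory.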

    Web server


    OSDDLinux provides a web server setup that launches, on the local machine, the same interface of the bioinformatics and cheminformatics tools that is available online. This makes OSDDLinux user-friendly, as many users of informatics tools prefer a graphical interface for submitting input to a tool and obtaining the results in the desired format.

    Galaxy


    Galaxy is a platform for data-intensive biomedical analysis. It is used through a graphical interface rather than the command line terminal, and is aimed at users who are not Unix-savvy but wish to analyze large data sets locally. Considering the long-term advantages of the Galaxy platform, which can be used both online and offline on a local machine, OSDDLinux also provides Galaxy-compatible versions of the bioinformatics and cheminformatics tools.

    Software Applications

    OSDDLinux promotes not only tool usage but also tool development within the research community: it provides access to the source code of the tools it hosts. The source code can be retrieved, improved and then integrated back into OSDDLinux by the user-developer for the benefit of the community. Another aspect is the provision of a login account for the user. This is an incentive for users who wish to participate actively in bioinformatics tool development but are limited by a scarcity of resources such as memory space and open-source software.

  • Webservers
  • Galaxy
  • Standalone
  • All-in-one