Bioinformatics: Biosoftwares

Wired Marker:

Wired-Marker is a permanent (indelible) highlighter that you use on Web pages. The highlighter, which comes in various colors and styles, is a kind of electronic bookmark that serves as a guide when you revisit a Web page. The highlighted content is automatically recorded in a scrapbook and saved.
Wired-Marker is freeware that was developed as part of the Integrated Database Project sponsored by the Ministry of Education, Culture, Sports, Science and Technology (development code name: ScrapParty) to support the construction of databases.
Ministry of Education, Culture, Sports, Science and Technology (2006) Integrated Database Project “Integrated Database for Life Sciences” II-3 Technology Development for Curator Support.

Browser: Mozilla Firefox

Licence: Freeware

http://www.wired-marker.org/en/
Download:
https://addons.mozilla.org/en-US/firefox/addon/6219/

BioClipse:

Bioclipse is a free, open source, workbench for chemo- and bioinformatics with powerful editing and visualization capabilities for molecules, sequences, proteins, spectra etc.

The Bioclipse project is aimed at creating a Java-based, open source, visual platform for chemo- and bioinformatics based on the Eclipse Rich Client Platform (RCP). Bioclipse, as any RCP application, is based on a plugin architecture that inherits basic functionality and visual interfaces from Eclipse, such as help system, software updates, preferences, cross-platform deployment etc.

Bioclipse will provide functionality for chemo- and bioinformatics, and extension points that easily can be extended by plugins to provide added functionality. The first version of Bioclipse includes a CDK-plugin (bc_cdk) to provide a chemoinformatic backend, a Jmol-plugin (bc_jmol) for 3D-visualization and a general logging plugin.

The major features are:

    * Import and export in various file formats
    * Visual editing of molecular 2D-structures
    * 3D-visualization of molecules and proteins
    * Editing and visualization of sequences and features (DNA, RNA, proteins etc)
    * Graphing and editing of various types of spectra, e. g. NMR, MS
    * Retrieval of resources (sequences, proteins, etc) from public data repositories
    * Scripting of 3D-visualizations with syntax highlighting and content assistance
    * PDB-editor with syntax highlighting for working with PDB files
    * CMLRSS-viewer for downloading chemical content published on the web using RSS-feeds
    * Chemtree for displaying a hierarchical view of molecular and macromolecular substructures
    * Visualization of standard chemical properties
    * Powerful scripting language based on Mozilla Rhino for automating tasks
    * Integrated, searchable help-system
    * Connection with external programs, e. g. PyMol

Bioclipse is a rich client, which means it is run on your local computer but also gives the possibility to communicate with servers for data retrieval and computational services. The powerful plugin architecture is based on Eclipse[2], and results in a responsive, integrated user interface designed for simple and intuitive operations that at the same time is easy to extend with custom functionality.

OS: Linux, Unix and Windows

Licence: Open Source (EPL)

http://www.bioclipse.net/

Download:
http://sourceforge.net/projects/bioclipse/files/

CORAL: aligning conserved core regions across domain families

Homologous protein families share highly conserved sequence and structure regions that are frequent targets for comparative analysis of related proteins and families. Many protein families, such as the curated domain families in the Conserved Domain Database (CDD), exhibit similar structural cores. To improve accuracy in aligning such protein families, we propose a profile–profile method CORAL that aligns individual core regions as gap-free units.

CORAL computes optimal local alignment of two profiles with heuristics to preserve continuity within core regions. We benchmarked its performance on curated domains in CDD, which have pre-defined core regions, against COMPASS, HHalign and PSI-BLAST, using structure superpositions and comprehensive curator-optimized alignments as standards of truth. CORAL improves alignment accuracy on core regions over general profile methods, returning a balanced score of 0.57 for over 80% of all domain families in CDD, compared with the highest balanced score of 0.45 from other methods. Further, CORAL provides E-values to aid in detecting homologous protein families and, by respecting block boundaries, produces alignments with improved ‘readability’ that facilitate manual refinement.
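To make the idea of aligning a core region as a gap-free unit more concrete, here is a minimal Python sketch. It is not CORAL's algorithm: the dot-product column score, the function names and the toy profiles are all assumptions made for illustration. It only shows what it means to place a whole block against another profile without opening any gaps.

```python
def column_score(col_a, col_b):
    """Score two profile columns as the dot product of their residue frequencies.

    Each column is a dict mapping residue -> frequency (an assumed, simplified
    scoring scheme, not the one used by CORAL).
    """
    return sum(freq * col_b.get(res, 0.0) for res, freq in col_a.items())


def best_gap_free_placement(block, profile):
    """Slide `block` along `profile` without gaps; return (best_offset, best_score)."""
    if len(block) > len(profile):
        raise ValueError("block is longer than profile")
    best_offset, best_score = 0, float("-inf")
    for offset in range(len(profile) - len(block) + 1):
        # Every block column is paired with exactly one profile column: no gaps.
        score = sum(column_score(col, profile[offset + i])
                    for i, col in enumerate(block))
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


# Toy data: a two-column core block and a four-column target profile.
block = [{"A": 1.0}, {"C": 0.5, "G": 0.5}]
profile = [{"T": 1.0}, {"A": 0.8, "T": 0.2}, {"C": 1.0}, {"G": 1.0}]
offset, score = best_gap_free_placement(block, profile)
print(offset, round(score, 2))
```

A real profile–profile method would use a statistically grounded column score and combine several blocks in one alignment; the sketch keeps only the gap-free constraint.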

OS: Windows, Mac OS

Licence: Freeware

http://www.ncbi.nlm.nih.gov/Structure/cdtree/cdtree.shtml

Download:
http://www.ncbi.nlm.nih.gov/Structure/cdtree/cdtree.shtml

BioConductor:

Bioconductor is an open source and open development software project to provide tools for the analysis and comprehension of genomic data.

Bioconductor is based primarily on the R programming language, but does contain contributions in other programming languages. It has two releases each year that follow the biannual releases of R. At any one time there is a release version, which corresponds to the released version of R, and a development version, which corresponds to the development version of R. Most users will find the release version appropriate for their needs. In addition there are a large number of meta-data packages available that are mainly, but not solely, oriented towards different types of microarrays.

Most Bioconductor components are distributed as R packages, which are add-on modules for R. Initially most of the Bioconductor software packages focused primarily on DNA microarray data analysis. As the project has matured, the functional scope of the software packages broadened to include the analysis of all types of genomic data, such as SAGE, sequence, or SNP data.

Main Features:

# The R Project for Statistical Computing. R and the R package system provide a broad range of advantages to the Bioconductor project including:

* It contains a high-level interpreted language in which one can easily and quickly prototype new computational methods.
* It includes a well established system for packaging together software components and documentation.
* It can address the diversity and complexity of computational biology and bioinformatics problems in a common object-oriented framework.
* It provides access to on-line computational biology and bioinformatics data sources.
* It supports a rich set of statistical simulation and modeling activities.
* It contains cutting edge data and model visualization capabilities.
* It has been the basis for pathbreaking research in parallel statistical computing.
* It is under very active development by a dedicated team of researchers with a strong commitment to good documentation and software design.

# Documentation and reproducible research. Each Bioconductor package contains at least one vignette, which is a document that provides a textual, task-oriented description of the package’s functionality. These vignettes come in several forms. Many are simple “HowTo”s that are designed to demonstrate how a particular task can be accomplished with that package’s software. Others provide a more thorough overview of the package or might even discuss general issues related to the package. In the future, we are looking towards providing vignettes that are not specifically tied to a package, but rather are demonstrating more complex concepts. As with all aspects of the Bioconductor project, users are encouraged to participate in this effort.

# Statistical and graphical methods. The Bioconductor project aims to provide access to a wide range of powerful statistical and graphical methods for the analysis of genomic data. Analysis packages are available for: pre-processing Affymetrix and cDNA array data; identifying differentially expressed genes; graph theoretical analyses; plotting genomic data. In addition, the R package system itself provides implementations for a broad range of state-of-the-art statistical and graphical techniques, including linear and non-linear modeling, cluster analysis, prediction, resampling, survival analysis, and time-series analysis.

# Annotation. The Bioconductor project provides software for associating microarray and other genomic data in real time to biological metadata from web databases such as GenBank, LocusLink and PubMed (annotate package). Functions are also provided for incorporating the results of statistical analysis in HTML reports with links to annotation WWW resources.
Software tools are available for assembling and processing genomic annotation data, from databases such as GenBank, the Gene Ontology Consortium, LocusLink, UniGene, the UCSC Human Genome Project (AnnotationDbi package).
Data packages are distributed to provide mappings between different probe identifiers (e.g. Affy IDs, LocusLink, PubMed). Customized annotation libraries can also be assembled.

# Bioconductor short courses. The Bioconductor project has developed a program of short courses on software and statistical methods for the analysis of genomic data. Courses have been given for audiences with backgrounds in either biology or statistics. All course materials (lectures and computer labs) are available on the WWW. Customized short courses may also be designed for interested parties.

# Open source. The Bioconductor project has a commitment to full open source discipline, with distribution via a SourceForge-like platform. All contributions are expected to exist under an open source license such as Artistic 2.0, GPL2, or BSD. There are many different reasons why open-source software is beneficial to the analysis of microarray data and to computational biology in general. The reasons include:
* To provide full access to algorithms and their implementation
* To facilitate software improvements through bug fixing and software extension
* To encourage good scientific computing and statistical practice by providing appropriate tools and instruction
* To provide a workbench of tools that allow researchers to explore and expand the methods used to analyze biological data
* To ensure that the international scientific community is the owner of the software tools needed to carry out research
* To lead and encourage commercial support and development of those tools that are successful
* To promote reproducible research by providing open and accessible tools with which to carry out that research (reproducible research is distinct from independent verification)

OS: Linux, OS X, and Windows

Licence: Open Source (Artistic 2.0, GPL2, or BSD contributions)
http://www.bioconductor.org/
Download:
http://www.bioconductor.org/download/

Genomicus – a new synteny browser with broad phylogenetic tree visualization

Comparative genomics studies the evolution of gene organization, a field that in recent years has been driven by the fast-growing number of full genome sequences available in public repositories. Most of the available bioinformatics tools for visualization and comparison of genomes are restricted to two or three genomes at a time, which is a significant limitation in the field of comparative genomics. Recently (this month) a new phylogenetic software tool was announced: Genomicus. It provides the functionality of a synteny browser, which can represent and compare unlimited numbers of genomes in a broad phylogenetic view. The tool provides reconstructed ancestral gene organization, in a way that facilitates the interpretation of the analysis data.

Comparative genomics aims to:

    * identify conserved functional regions
    * document differences among these functional sequences as a first step to understanding broader biological differences (metabolic, developmental, etc.) between organisms
    * outline the evolutionary events that have interrupted the gene colinearity between the genomes of two species since their closest common ancestor.

Data integrated in Genomicus

Genomicus is based on already integrated and publicly available data from Ensembl (Ensembl Phylogenetic Trees). Genomicus utilizes just two main types of information from Ensembl:

    * gene positions in the respective genomes
    * phylogenetic relationships (orthology, paralogy) between genes.

Genomicus then edits Ensembl phylogenetic trees in the following steps:

   1. Selection of duplication nodes below a defined threshold that is optimized to increase the synteny between extant genomes.
   2. Integration of information from Boreoeutheria, Euarchontoglires and Atlantogenata ancestral nodes in existing trees of placental mammals.
   3. Addition of genomic information from some extant species that are not currently referenced in Ensembl (Branchiostoma floridae, Nematostella vectensis and Oikopleura dioica), together with their respective ancestral nodes.

For each of the integrated genomes, best reciprocal protein BLAST comparisons are performed.
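As a rough illustration of what a best reciprocal comparison means, the Python sketch below pairs genes from two genomes only when each is the other's top-scoring hit. The tuple layout, gene names and scores are made up for the example, and the code does not run BLAST or reproduce the Genomicus pipeline; it only shows the reciprocity test applied to hit lists that are assumed to exist already.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    `hits` is an iterable of (query, subject, score) tuples, an assumed
    simplification of tabular BLAST output.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical hits between genome A proteins and genome B proteins.
a_vs_b = [("a1", "b1", 250.0), ("a1", "b2", 90.0),
          ("a2", "b2", 120.0), ("a3", "b1", 80.0)]
b_vs_a = [("b1", "a1", 245.0), ("b2", "a2", 118.0), ("b2", "a1", 95.0)]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

Here `a3` finds `b1` as its best hit, but `b1` prefers `a1`, so `a3` is left unpaired.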

Functionality

Genomicus offers two types of data visualization:

    * PhyloView – a phylogenetic tree visualization, which shows the chosen reference gene in the centre of the display with 15 neighbouring genes on each side. It also displays the orthologs and paralogs of the query gene in their own respective genomic regions. If these neighbouring genes are orthologs or paralogs of genes in the reference species, they are displayed with matching colours. When an analog of the reference gene is duplicated (shown as a red square), the originating species may appear twice on the screen.
    * AlignView – provides an alignment between the genes contained within the genomic region of the reference gene and all their respective orthologs in other species.
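The windowing and colour matching described for PhyloView can be sketched in a few lines of Python. Everything here is hypothetical (the gene identifiers, the `ortholog_of` mapping, the use of a list index as a colour); Genomicus itself is implemented in Perl, and this is only one simple way to express the same idea.

```python
def neighbourhood(genes, reference, flank=15):
    """Return the reference gene with up to `flank` neighbours on each side.

    `genes` is the ordered list of gene identifiers along one chromosome.
    """
    index = genes.index(reference)
    start = max(0, index - flank)
    return genes[start:index + flank + 1]


def matching_colours(reference_window, other_window, ortholog_of):
    """Give each gene in `other_window` the colour index of its reference homolog.

    `ortholog_of` maps a gene in the other species to a gene in the reference
    species; genes without a homolog in the window get None (no colour).
    """
    colour_index = {gene: i for i, gene in enumerate(reference_window)}
    return [colour_index.get(ortholog_of.get(gene)) for gene in other_window]


# Hypothetical chromosome of 40 genes, g0 .. g39.
genes = [f"g{i}" for i in range(40)]
window = neighbourhood(genes, "g20")
edge_window = neighbourhood(genes, "g3")  # truncated at the chromosome start

colours = matching_colours(window, ["h1", "h2", "h3"], {"h1": "g20", "h3": "g6"})
print(len(window), len(edge_window), colours)
```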

In both views, the phylogenetic tree can be edited (expanding, collapsing, hiding, showing chosen nodes) to clarify the view.

Genomicus also displays orthologous conserved non-coding elements at three levels of conservation. In addition, a link-out functionality to other browsers like Ensembl, UCSC and NCBI is provided.

System Implementation

Genomicus is a web-based application, developed with Perl scripts and modules, executed with mod_perl on an Apache2 server. The data is persisted in a MySQL database. The visualization is based on inline SVG drawings embedded in XHTML, in which the JavaScript usage is limited to an information panel retrieved with AJAX calls. The browsers are required to support the Google Chrome Frame extension.

Availability
Genomicus is freely available for online use at http://www.dyogen.ens.fr/genomicus.
The data can be downloaded at ftp://ftp.biologie.ens.fr/pub/dyogen/genomicus.

BioEdit – sequence alignment editor

BioEdit is an easy-to-use sequence alignment editor and sequence analysis program designed and written by a graduate student who knows how frustrating and time consuming it can be to rely upon word-processors and command-line programs for sequence manipulation. BioEdit is intended to supply a single program that can handle most simple sequence and alignment editing and manipulation functions that researchers are likely to do on a daily basis, as well as a few basic sequence analyses. BioEdit offers a variety of useful features:

    * Four modes of manual alignment: select and slide, dynamic grab and drag, gap insert and delete by mouse click, and on-screen typing which behaves like a text editor.
    * In-color alignment and editing with separate nucleic acid and amino acid color tables and full control over background colors.
    * Plasmid drawing interface for automated creation of plasmid vector graphic from a DNA sequence. Easily mark positions, add features with arrows and curved boxes, and mark restriction enzyme cut sites. Also show detail of polylinker and draw moveable arrows and shapes with drawing tools.
    * Dynamic information-based alignment shading.
    * Point-and-click color table editing
    * Display and print ABI chromatograms with professional-looking output.
    * Group sequences into groups or families.
    * Lock alignment of grouped sequences for synchronized hand alignment adjustments.
    * Annotate sequences with graphical features with dynamic view in alignment windows including feature annotation information tooltips.
    * Lock sequences to prevent accidental edits.
    * Specify characters to be considered valid for calculations in amino acid and nucleotide sequences.
    * Sort sequences by name, LOCUS, DEFINITION, ACCESSION, PID/NID, REFERENCES, COMMENTS or by residue frequency in a selected column.
    * Merge alignments through a reference sequence.
    * Append one alignment to the end of another.
    * Rudimentary phylogenetic tree viewer (for phylip-format trees) that allows node flipping and printing.
    * Verbally read back sequences in single sequence editor to verify hand-typed sequence entries.
    * Reads and writes Genbank, Fasta, Phylip 3.2, Phylip 4, and NBRF/PIR formats. Now also reads GCG and Clustal formats
    * Utilizes Don Gilbert’s ReadSeq to automatically import and export 11 additional formats, including MSF, ASN.1, IG/Stanford and EMBL.
    * Allows import of compatible formats directly from the clipboard without saving to a file first.
    * Easy customization of menu shortcuts for editor window
    * RNA comparative analysis, including covariation, potential pairings and mutual information analysis (currently capable of generating matrices up to 10,000 x 10,000 — but this would be a 600+ Mb file) with matrix plotter for 2-D matrix output tables and area graphing for individual rows of a data matrix. Matrix plotter and line graphs both have point-and-click data selection and the matrix plotter and 1-D line graphs of matrix data are now dynamically linked
    * View sections of very large matrices with plotter (tested on up to a 5183 x 5183 matrix = 180 Mb file)
    * View and manipulate alignments up to 20,000 sequences.
    * Binary file format (BioEdit Project format) for fast open and save of large alignments — the 6205 sequences of the prokaryotic 16S rRNA alignment (29 Mb file) open and save in less than 10 sec on a 233 MHz Pentium.
    * ORF searching with user-defined preferences
    * Formatted translations of nucleic acid sequences with codon usage summary, choice of one- or three-letter amino acid codes, translation of selected region only of nucleic acid, and choice of start/stop codons
    * Split window view for simultaneous and synchronized editing of two different places in the same file — split window vertically or horizontally
    * Amino acid and nucleotide composition analyses and plots
    * Align protein-encoding nucleic acid sequences through amino acid translation.
    * ClustalW multiple sequence alignment (interface internal, external program by Des Higgins et al.) with auto-update of aligned protein full titles and GenBank field information, as well as nucleotide coding sequence when aligned from a protein view of nucleotide sequences.
    * Protein hydrophobicity/hydrophilicity plots
    * Protein hydrophobic moment matrix plots (0-180 degrees)
    * Full choice of system fonts now available in edit window
    * Restriction mapping with any or all-frame translation, multiple enzyme choice and output options, and circular DNA capability
    * Browse restriction enzymes by manufacturer
    * Sequences at least 4.6 Mb in length can be manipulated (the largest sequence tested so far is the E. coli genome (4.6 Mb) — E. coli was opened, reverse complemented, translated into 10,125 codon stretches >=100 amino acids, and opened and saved with full GenBank annotation).
    * Six-frame translations capable of raw translation of entire genomes (tested with the E. coli genome — ca. 4.6 Mb)
    * Save GenBank format Entrez files with LOCUS, DEFINITION, ACCESSION, PID, NID, DBSOURCE, KEYWORDS, SOURCE, REFERENCE, COMMENT, and FEATURES fields intact. Modify or add your own information. Multiple sequence files saved in GenBank format retain any entered information.   This information is also saved in the BioEdit Project file format.
    * Configure and run accessory applications via the BioEdit graphical application configuration interface. BioEdit currently comes with:

          o TreeView
          o CAP assembly
          o FastDNml
          o Phylip programs including:

                + DNADist
                + DNAmlk
                + Fitch
                + Kitch
                + ProtDist
                + ProtPars

    * Full NCBI package of local BLAST programs, database creation, and internet BLAST client 2.0, with sample protein database of E. coli open reading frames.
    * Shaded graphical output with identity and similarity (for protein) shading and several formatting options.
    * Rich text export of formatted, shaded alignments
    * On-line help system (always a couple of versions behind the program).
    * Entropy (information lack) plotting.
    * Multiple document interface.
    * Basic sequence manipulations (reverse/complement, translate, DNA->RNA->DNA)
    * Easy text export and configurable text printing.
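For readers who want to see what the basic sequence manipulations in the list above amount to (reverse complement, DNA to RNA and back, translation, six-frame translation), here is a small Python sketch. It is not BioEdit code, and the function names are chosen for this example; it uses the standard genetic code and marks unrecognized codons with `X`, which is an assumption of this sketch.

```python
from itertools import product

# Translation table for complementing DNA, preserving case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(dna):
    """Complement each base, then reverse the strand."""
    return dna.translate(COMPLEMENT)[::-1]


def transcribe(dna):
    """DNA -> RNA: replace T with U."""
    return dna.upper().replace("T", "U")


def back_transcribe(rna):
    """RNA -> DNA: replace U with T."""
    return rna.upper().replace("U", "T")


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translations(dna):
    """Translate the three forward and three reverse-complement frames."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


frames = six_frame_translations("ATGGCCTAA")
print(reverse_complement("ATGC"), transcribe("ATGC"), frames["+1"], frames["-1"])
```

The stop codon is written as `*`, so an open reading frame search could scan each frame for stretches between `M` and `*`.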

OS: Windows 95/98/NT/2000/XP

http://www.mbio.ncsu.edu/BioEdit/bioedit.html

Download:
http://www.mbio.ncsu.edu/BioEdit/BioEdit.zip

NoteExpress – scientific information manager

NoteExpress is an assistant and information manager for researchers, scholars, students, and librarians. NoteExpress is designed to help you organize research notes and bibliographic references, generate bibliographies automatically, and search and capture bibliographic data from the Internet with efficiency and ease. NoteExpress is well integrated with Microsoft Word. It can format bibliographies in many popular styles. NoteExpress works the same as many other bibliographic software packages but with many additional efficient features like:

    * Manage bibliographic data and notes with efficiency
    * Organize academic papers and any other files on disk together with bibliographic data and notes
    * Capture data from many Internet sources
    * Multilingual style formatting supported
    * Efficient and robust
    * Bibliographic records and notes can be categorized in different folders. What’s more, they can be multi-categorized without being duplicated.
    * NoteExpress supports many internet libraries and can import bibliographic data without tedious typing. NoteExpress imports data via the Z39.50 protocol, which is commonly used by most public libraries. With NoteExpress, users can search and import bibliographies from the Library of Congress, OCLC, etc. NoteExpress also supports searches in Amazon, JSTOR, and the Web of Science library.
    * NoteExpress imports bibliographic data with dramatic speed (0.5 milliseconds for 10000 records on a Pentium IV 1.8G, 256M computer, more than 10 times faster than any other bibliographic tool!).
    * NoteExpress is well integrated into Microsoft Word. Users can cite and write without leaving the Word environment. More than 1350 journal styles are included, and the number is increasing.
    * NoteExpress helps you manage the academic files on your hard disk. With NoteExpress’s attachment management module, users can add as many documents as they like to a NoteExpress library. That is, NoteExpress is not only a bibliographic manager, it’s also a scholar’s assistant.

OS: Win9x/Me/NT/2000/XP/2003

Licence: Free
http://www.reflib.org/index.htm
Download:
http://www.reflib.org/download.htm

FastPCR – an integrated tool for PCR primers design

FastPCR software is an integrated tools environment that provides comprehensive facilities for designing any kind of PCR primers for standard, long distance, inverse, real-time PCR (LUX and self-reporting), multiplex PCR, group-specific PCR (common primers for given N target sequences), unique PCR (design of specific (unique) PCR primers for each sequence), single primer PCR (design of PCR primers from closely located inverted repeats), automatic detection of SSR loci with direct PCR primer design, amino acid sequence degenerate PCR and more. The software consists of a data editor, built-in commands for probe/primer design, and build automation tools.
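As a small illustration of the kind of property a primer design tool evaluates, the Python sketch below computes GC content and a rough melting temperature with the Wallace rule (2 °C per A/T, 4 °C per G/C). This is not how FastPCR works internally, which the text above does not describe; the function names and the length and GC thresholds are assumptions for this example, and the Wallace rule is only a coarse estimate for short oligonucleotides.

```python
def gc_content(primer):
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    if not primer:
        raise ValueError("primer must not be empty")
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Approximate melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def passes_basic_checks(primer, min_len=18, max_len=25, min_gc=0.4, max_gc=0.6):
    """Apply illustrative (assumed) length and GC-content limits."""
    return (min_len <= len(primer) <= max_len
            and min_gc <= gc_content(primer) <= max_gc)


primer = "ATGCATGCATGCATGCATGC"  # made-up 20-mer
print(gc_content(primer), wallace_tm(primer), passes_basic_checks(primer))
```

A real design run would also check for hairpins, primer dimers and specificity against the target sequences, none of which are modelled here.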

OS: Windows (Windows XP, Windows Server 2003, Windows Vista or 7; 32-Bit (x86) or 64-Bit Windows)

Licence: Free for non-commercial usage
http://primerdigital.com/
