Bio::AlignIO::stockholm.3pm

Language: en

Version: 2010-05-19 (ubuntu - 24/10/10)

Section: 3 (Library functions)

NAME

Bio::AlignIO::stockholm - stockholm sequence input/output stream

SYNOPSIS

   # Do not use this module directly.  Use it via the Bio::AlignIO class.
 
   use strict;
   use Bio::AlignIO;
 
   my $in = Bio::AlignIO->new(-format => 'stockholm',
                              -file   => 't/data/testaln.stockholm');
   while( my $aln = $in->next_aln ) {
       # process each Bio::Align::AlignI object here
   }
 
 

DESCRIPTION

This object can transform Bio::Align::AlignI objects to and from Stockholm flat file databases. The parser has been completely refactored from the original Stockholm parser to handle annotation data, and it now includes a write_aln() method for (almost) complete Stockholm format output.

Stockholm alignment records normally contain additional sequence-based and alignment-based annotation:

   GF Lines (alignment feature/annotation):
   #=GF <featurename> <Generic per-file annotation, free text>
   Placed above the alignment
 
   GC Lines (Alignment consensus)
   #=GC <featurename> <Generic per-column annotation, exactly 1
        character per column>
   Placed below the alignment
 
   GS Lines (Sequence annotations)
   #=GS <seqname> <featurename> <Generic per-sequence annotation, free
        text>
 
   GR Lines (Sequence meta data)
   #=GR <seqname> <featurename> <Generic per-sequence AND per-column
        mark up, exactly 1 character per column>
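
As a rough illustration of the four markup types above, the following pure-Perl sketch sorts the lines of a record by their prefix. The helper name classify_stockholm_line and the keys of the returned hash are hypothetical and are not part of Bio::AlignIO::stockholm, whose parser is organised differently; the sample record is invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of BioPerl: classify one line of a
# Stockholm record by its markup prefix.
sub classify_stockholm_line {
    my ($line) = @_;
    $line =~ s/\s+\z//;    # drop trailing newline/whitespace

    return { type => 'header' } if $line =~ /^# STOCKHOLM\s+\S+/;
    return { type => 'end' }    if $line eq '//';

    # Per-file annotation: feature name, then free text
    if ($line =~ /^#=GF\s+(\S+)\s+(.*)$/) {
        return { type => 'GF', feature => $1, value => $2 };
    }
    # Per-column annotation: feature name, then one character per column
    if ($line =~ /^#=GC\s+(\S+)\s+(\S+)$/) {
        return { type => 'GC', feature => $1, value => $2 };
    }
    # Per-sequence annotation: sequence name, feature name, free text
    if ($line =~ /^#=GS\s+(\S+)\s+(\S+)\s+(.*)$/) {
        return { type => 'GS', seqname => $1, feature => $2, value => $3 };
    }
    # Per-sequence AND per-column markup
    if ($line =~ /^#=GR\s+(\S+)\s+(\S+)\s+(\S+)$/) {
        return { type => 'GR', seqname => $1, feature => $2, value => $3 };
    }
    # Any other two-field line not starting with '#' is treated as an
    # aligned sequence line
    if ($line =~ /^([^#\s]\S*)\s+(\S+)$/) {
        return { type => 'seq', seqname => $1, value => $2 };
    }
    return { type => 'other' };
}

# Invented sample record, for illustration only
my @lines = (
    '# STOCKHOLM 1.0',
    '#=GF ID myID12345',
    '#=GS seq1/1-8 AC XX00001.1',
    'seq1/1-8 ACGUACGU',
    '#=GR seq1/1-8 SS <<....>>',
    '#=GC SS_cons <<....>>',
    '//',
);
for my $line (@lines) {
    my $rec = classify_stockholm_line($line);
    printf "%-6s %s\n", $rec->{type}, $line;
}
```

The sequence-line rule is deliberately loose; a real parser would also have to cope with interleaved blocks and blank lines.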
 
 

Currently, sequence annotations (those designated with GS tags) are parsed only for accession numbers and descriptions. Full parsing is planned, along with a builder option for optionally parsing alignment annotation and meta data.

The following methods/tags are currently used for storing and writing the alignment annotation data.

     Tag        SimpleAlign
                  Method  
     ----------------------------------------------------------------------
      AC        accession  
      ID        id  
      DE        description
     ----------------------------------------------------------------------
 
     Tag        Bio::Annotation   TagName                    Parameters
                Class
     ----------------------------------------------------------------------
      AU        SimpleValue       record_authors             value
      SE        SimpleValue       seed_source                value
      GA        SimpleValue       gathering_threshold        value
      NC        SimpleValue       noise_cutoff               value
      TC        SimpleValue       trusted_cutoff             value
      TP        SimpleValue       entry_type                 value
      SQ        SimpleValue       num_sequences              value
      PI        SimpleValue       previous_ids               value
      DC        Comment           database_comment           comment
      CC        Comment           alignment_comment          comment
      DR        Target            dblink                     database
                                                             primary_id
                                                             comment
      AM        SimpleValue       build_method               value
      NE        SimpleValue       pfam_family_accession      value
      NL        SimpleValue       sequence_start_stop        value
      SS        SimpleValue       sec_structure_source       value
      BM        SimpleValue       build_model                value
      RN        Reference         reference                  *
      RC        Reference         reference                  comment
      RM        Reference         reference                  pubmed
      RT        Reference         reference                  title
      RA        Reference         reference                  authors
      RL        Reference         reference                  location
     ----------------------------------------------------------------------
   * RN is generated based on the number of Bio::Annotation::Reference objects
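
To show how the mapping above is used from code, here is a minimal sketch that fills in and then reads back a few of the mapped fields. It assumes BioPerl is installed and builds the alignment object by hand rather than parsing a file; the accession, description, and author values are invented for the example.

```perl
use strict;
use warnings;
use Bio::SimpleAlign;
use Bio::Annotation::Collection;
use Bio::Annotation::SimpleValue;

# Build a small alignment object by hand so the example does not
# depend on any particular data file.
my $aln = Bio::SimpleAlign->new();
$aln->id('myID12345');
$aln->accession('XX00001');            # invented accession
$aln->description('example family');   # invented description

# Attach an explicit annotation collection to the alignment.
my $coll = Bio::Annotation::Collection->new();
$aln->annotation($coll);

# AU is stored as a SimpleValue under the tagname 'record_authors'.
$coll->add_Annotation('record_authors',
    Bio::Annotation::SimpleValue->new(-value => 'Baggins B'));

# AC, ID, and DE come back through the SimpleAlign methods ...
printf "ID %s\nAC %s\nDE %s\n",
    $aln->id, $aln->accession, $aln->description;

# ... and the remaining tags through the annotation collection.
my @authors = map { $_->value } $coll->get_Annotations('record_authors');
print "AU $_\n" for @authors;
```

The same get_Annotations() call with a different tagname from the table (for example 'seed_source') should retrieve the other SimpleValue entries after an alignment has been parsed.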
 
 

Custom annotation

Some users may want to add custom annotation beyond those mapped above. Currently there are two methods to do so; however, the methods used for adding such annotation may change in the future, particularly if alignment Writer classes are introduced. In particular, do not rely on changing the global variables @WRITEORDER or %WRITEMAP as these may be made private at some point.

1) Use (and abuse) the 'custom' tag. The tagname for the object can differ from the tagname used to store the object in the AnnotationCollection.

     use Bio::Annotation::AnnotationFactory;

     # AnnotationCollection from the SimpleAlign object
     my $coll = $aln->annotation;
     my $factory = Bio::Annotation::AnnotationFactory->new(-type =>
         'Bio::Annotation::SimpleValue');
     # $str holds the (possibly long) annotation text
     my $rfann = $factory->create_object(-value => $str,
                                         -tagname => 'mytag');
     $coll->add_Annotation('custom', $rfann);
     $rfann = $factory->create_object(-value => 'foo',
                                      -tagname => 'bar');
     $coll->add_Annotation('custom', $rfann);
 
 

OUTPUT:

     # STOCKHOLM 1.0
     
     #=GF ID myID12345
     #=GF mytag katnayygqelggvnhdyddlakfyfgaglealdffnnkeaaakiinwvaEDTTRGKIQDLV??
     #=GF mytag TPtd~????LDPETQALLV???????????????????????NAIYFKGRWE?????????~??
     #=GF mytag ??HEF?A?EMDTKPY??DFQH?TNen?????GRI??????V???KVAM??MF?????????N??
     #=GF mytag ???DD?VFGYAEL????DE???????L??D??????A??TALELAY??????????????????
     #=GF mytag ?????????????KG??????Sa???TSMLILLP???????????????D??????????????
     #=GF mytag ???????????EGTr?????AGLGKLLQ??QL????????SREef??DLNK??L???AH????R
     #=GF mytag ????????????L????????????????????????????????????????R?????????R
     #=GF mytag ??QQ???????V???????AVRLPKFSFefefdlkeplknlgmhqafdpnsdvfklmdqavlvi
     #=GF mytag gdlqhayafkvd????????????????????????????????????????????????????
     #=GF mytag ????????????????????????????????????????????????????????????????
     #=GF mytag ????????????????????????????????????????????????????????????????
     #=GF mytag ????????????????????????????????????????????????????????????????
     #=GF mytag ?????????????INVDEAG?TEAAAATAAKFVPLSLppkt??????????????????PIEFV
     #=GF mytag ADRPFAFAIR??????E?PAT?G????SILFIGHVEDPTP?msv?
     #=GF bar foo
     ...
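
In the sample output, the long 'mytag' value is spread over repeated "#=GF mytag" lines of 64 characters each. The following pure-Perl sketch mimics that layout. The helper gf_lines is hypothetical, the default width of 64 is taken from the sample output only, and the module's actual wrapping logic may differ (for instance, it may break on whitespace).

```perl
use strict;
use warnings;

# Hypothetical helper, not BioPerl code: split a long value into
# fixed-width chunks and build one "#=GF <tag> <chunk>" line per chunk.
sub gf_lines {
    my ($tag, $value, $width) = @_;
    $width ||= 64;    # assumed width, as seen in the sample output
    my @chunks = $value =~ /(.{1,$width})/gs;
    return map { "#=GF $tag $_" } @chunks;
}

# A 150-character value yields chunks of 64, 64, and 22 characters.
print "$_\n" for gf_lines('mytag', 'A' x 150);

# A short value fits on a single line.
print "$_\n" for gf_lines('bar', 'foo');
```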
 
 

2) Modify the global @WRITEORDER and %WRITEMAP.

     # AnnotationCollection from the SimpleAlign object
     my $coll = $aln->annotation;
     
     # add to WRITEORDER
     my @order = @Bio::AlignIO::stockholm::WRITEORDER;
     push @order, 'my_stuff';
     @Bio::AlignIO::stockholm::WRITEORDER = @order;
     
     # make sure new tag maps to something
     $Bio::AlignIO::stockholm::WRITEMAP{my_stuff} = 'Hobbit/SimpleValue';
 
     # $factory is the AnnotationFactory created in the first example
     my $rfann = $factory->create_object(-value => 'Frodo',
                                         -tagname => 'Hobbit');
     $coll->add_Annotation('my_stuff', $rfann);
     $rfann = $factory->create_object(-value => 'Bilbo',
                                      -tagname => 'Hobbit');
     $coll->add_Annotation('my_stuff', $rfann);
 
 

OUTPUT:

     # STOCKHOLM 1.0
     
     #=GF ID myID12345
     #=GF Hobbit Frodo
     #=GF Hobbit Bilbo
     ....
 
 

FEEDBACK

Support

Please direct usage questions or support issues to the mailing list:

bioperl-l@bioperl.org

rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.

Reporting Bugs

Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:
   http://bugzilla.open-bio.org/
 
 

AUTHORS - Chris Fields, Peter Schattner

Email: cjfields-at-uiuc-dot-edu, schattner@alum.mit.edu

CONTRIBUTORS

Andreas Kahari, ak-at-ebi.ac.uk
Jason Stajich, jason-at-bioperl.org

APPENDIX

The rest of the documentation details each of the object methods. Internal methods are usually prefixed with an underscore (_).

new

  Title   : new
  Usage   : my $alignio = Bio::AlignIO->new(-format => 'stockholm',
                                           -file   => '>file');
  Function: Initialize a new Bio::AlignIO::stockholm reader or writer
  Returns : Bio::AlignIO object
  Args    : -line_length :  length of the line for the alignment block
            -alphabet    :  symbol alphabet to set the sequences to.  If not set,
                            the parser will try to guess based on the alignment
                            accession (if present), defaulting to 'dna'.
            -spaces      :  (optional, def = 1) boolean; if true, add a blank
                            line between the "# STOCKHOLM 1.0" header and the
                            annotation, and between the annotation and the
                            alignment.
 
 

next_aln

  Title   : next_aln
  Usage   : $aln = $stream->next_aln()
  Function: returns the next alignment in the stream.
  Returns : Bio::Align::AlignI object
  Args    : NONE
 
 

write_aln

  Title   : write_aln
  Usage   : $stream->write_aln(@aln)
  Function: writes the alignment object(s) in @aln into the stream in
            stockholm format
  Returns : 1 for success and 0 for error
  Args    : one or more Bio::Align::AlignI objects
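
As a sketch of how next_aln and write_aln fit together, the following copies every alignment from one Stockholm file to another. It assumes BioPerl is installed. The helper name copy_stockholm, the output file name, and the -line_length value are made up for the example; the input path is the one used in the SYNOPSIS and is skipped if it does not exist.

```perl
use strict;
use warnings;
use Bio::AlignIO;

# Hypothetical helper: copy every alignment from one Stockholm file to
# another, passing the writer options documented under new().
sub copy_stockholm {
    my ($in_file, $out_file) = @_;
    my $in  = Bio::AlignIO->new(-format => 'stockholm',
                                -file   => $in_file);
    my $out = Bio::AlignIO->new(-format      => 'stockholm',
                                -file        => ">$out_file",
                                -line_length => 60,   # arbitrary value
                                -spaces      => 1);
    my $count = 0;
    while (my $aln = $in->next_aln) {
        $out->write_aln($aln);
        $count++;
    }
    return $count;
}

# Input path from the SYNOPSIS; the output name is invented.
my $input = 't/data/testaln.stockholm';
if (-e $input) {
    my $n = copy_stockholm($input, 'copy.stockholm');
    print "wrote $n alignment(s)\n";
}
```

Because output is described as "(almost) complete", the copy is not guaranteed to be byte-identical to the input.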
 
 

line_length

  Title   : line_length
  Usage   : $obj->line_length($newval)
  Function: Set the alignment output line length
  Returns : value of line_length
  Args    : newvalue (optional)
 
 

alphabet

  Title   : alphabet
  Usage   : $obj->alphabet('dna')
  Function: Set the sequence data alphabet
  Returns : sequence data type
  Args    : newvalue (optional)
 
 

spaces

  Title   : spaces
  Usage   : $obj->spaces(1)
  Function: Set the 'spaces' flag, which prints extra newlines between the
            header and the annotation and the annotation and the alignment
  Returns : value of the 'spaces' flag
  Args    : newvalue (optional)
 
 

alignhandler

  Title   : alignhandler
  Usage   : $stream->alignhandler($handler)
  Function: Get/Set the Bio::HandlerBaseI object
  Returns : Bio::HandlerBaseI 
  Args    : Bio::HandlerBaseI