"aac".translate #=> "N"

2007-04-06 BioSQL on ActiveRecord

Bio::BioSQL access module by ActiveRecord

On the BioRuby mailing list, RJP released a new implementation of the BioSQL class, powered by ActiveRecord.

Preparation of the BioSQL database

Download the BioSQL schema.

Install the BioSQL schema.
% mysqladmin -u root -p create biosql_tx_test
% mysqladmin -u root -p create biosql_sp_test
% mysql -u root -p biosql_tx_test < sql/biosqldb-mysql.sql
% mysql -u root -p biosql_sp_test < sql/biosqldb-mysql.sql

I prepared two MySQL databases: biosql_tx_test for NCBI Taxonomy data and biosql_sp_test for SwissProt data.


Install NCBI Taxonomy data

In the biosql-schema directory, download and unpack the taxonomy dump,

% mkdir taxdata
% cd taxdata
% lftp ftp.ncbi.nlm.nih.gov/pub/taxonomy
cd ok, cwd=/pub/taxonomy                                                
lftp ftp.ncbi.nlm.nih.gov:/pub/taxonomy> get taxdump.tar.gz
lftp ftp.ncbi.nlm.nih.gov:/pub/taxonomy> exit
% tar zxvf taxdump.tar.gz
% ls
citations.dmp  delnodes.dmp  division.dmp  gc.prt  gencode.dmp  
merged.dmp  names.dmp  nodes.dmp  readme.txt  taxdump.tar.gz

and invoke the load_ncbi_taxonomy.pl script.

% cd ..
% perl scripts/load_ncbi_taxonomy.pl --dbname biosql_tx_test --dbuser root

Accessing NCBI Taxonomy data using ARBioSQL

Connecting to the database.

% irb -r bio -r arbiosql.rb
irb(main):001:0> con = Bio::BioSQL.new('mysql', 'biosql_tx_test', 'root')
irb(main):002:0> Bio::BioSQL::Taxon.find(1)
=> #<Bio::BioSQL::Taxon:0x2933648 @attributes={"genetic_code"=>"1", "node_rank"=>"no rank", "right_value"=>"712768", "left_value"=>"1", "taxon_id"=>"1", "ncbi_taxon_id"=>"1", "mito_genetic_code"=>"0", "parent_taxon_id"=>"1"}>

Bio::BioSQL provides a method for connecting to a BioSQL database (here through the 'mysql' adapter) and ActiveRecord model classes such as Bio::BioSQL::Taxon.

How many Taxon entries are there?

irb(main):003:0> Bio::BioSQL::Taxon.count
=> 356384
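
This count can be cross-checked against the root taxon attributes printed by Taxon.find(1) earlier. The sketch below assumes that left_value and right_value are nested-set numbers, which is how the BioSQL taxon table is usually described; the names ROOT_ATTRS and descendant_count are made up for the illustration and are not part of ARBioSQL.

```ruby
# Root taxon attributes copied from the Taxon.find(1) output above.
# The irb output shows them as strings, so they are converted with to_i.
ROOT_ATTRS = { "left_value" => "1", "right_value" => "712768" }

# Assumption: left_value/right_value form a nested set, where every node
# uses up two numbers. A node's descendants then number (right - left - 1) / 2.
def descendant_count(attrs)
  left  = attrs["left_value"].to_i
  right = attrs["right_value"].to_i
  (right - left - 1) / 2
end

# Descendants of the root plus the root itself.
puts descendant_count(ROOT_ATTRS) + 1
```

Under that assumption the script prints 356384, the same number that Taxon.count returned.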

What is the human Taxonomy ID?

irb(main):019:0> Bio::BioSQL::TaxonName.find_by_name("human")
=> #<Bio::BioSQL::TaxonName:0x29449c0 @attributes={"name"=>"human", "name_class"=>"genbank common name", "taxon_id"=>"9606"}>
irb(main):020:0> Bio::BioSQL::TaxonName.find_by_name("human").taxon.id
=> 9606
irb(main):021:0> Bio::BioSQL::TaxonName.find_by_name("human").taxon_id
=> 9606

TaxonName belongs_to Taxon.

Although Bio::BioSQL::Taxon has a "parent_taxon_id" field, a Bio::BioSQL::Taxon#parent_taxon method (returning a Bio::BioSQL::Taxon) is not implemented.

The definition of the Bio::BioSQL::Taxon class:

class Taxon < ActiveRecord::Base
  set_table_name "taxon"
  set_primary_key "taxon_id"
  set_sequence_name "taxon_pk_seq"
  has_many :taxon_name #probably has_one
  has_one :bioentry
end

Either of the following additions to the class makes the parent Taxon available through a Taxon#parent method,

  belongs_to :parent, :foreign_key => 'parent_taxon_id', :class_name => 'Taxon'

or

  def parent
    self.class.find(self.parent_taxon_id)
  end

Both versions give the same result.

irb(main):002:0> tax = Bio::BioSQL::Taxon.find(10)
=> #<Bio::BioSQL::Taxon:0x2949d30 @attributes={"genetic_code"=>"11", "node_rank"=>"genus", "right_value"=>"66309", "left_value"=>"66272", "taxon_id"=>"10", "ncbi_taxon_id"=>"10", "mito_genetic_code"=>"0", "parent_taxon_id"=>"135621"}>
irb(main):003:0> tax.parent
=> #<Bio::BioSQL::Taxon:0x2947cb0 @attributes={"genetic_code"=>"11", "node_rank"=>"family", "right_value"=>"73406", "left_value"=>"66271", "taxon_id"=>"135621", "ncbi_taxon_id"=>"135621", "mito_genetic_code"=>"0", "parent_taxon_id"=>"72274"}>
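
With a parent link in place, a full lineage can be collected by following it upward. One detail from the output of Taxon.find(1) matters here: the root taxon has parent_taxon_id 1, so it points at itself and the walk has to stop there. The following is a minimal sketch that runs without a database. TaxonRow, ROWS and lineage are hypothetical stand-ins, not ARBioSQL code, and the row for taxon 72274 is a placeholder whose rank and parent are invented so that the chain reaches the root.

```ruby
# Hypothetical stand-in for a Bio::BioSQL::Taxon record.
TaxonRow = Struct.new(:taxon_id, :parent_taxon_id, :node_rank)

# Toy table keyed by taxon_id. Rows 10, 135621 and 1 mirror the irb output;
# the row for 72274 is a placeholder (rank and parent invented for the sketch).
ROWS = {
  10     => TaxonRow.new(10, 135621, "genus"),
  135621 => TaxonRow.new(135621, 72274, "family"),
  72274  => TaxonRow.new(72274, 1, "placeholder"),
  1      => TaxonRow.new(1, 1, "no rank")  # the root is its own parent
}

# Follow parent_taxon_id upward until a row points at itself (the root).
def lineage(row, table)
  path = [row]
  until row.parent_taxon_id == row.taxon_id
    row = table.fetch(row.parent_taxon_id)
    path << row
  end
  path
end

p lineage(ROWS[10], ROWS).map(&:taxon_id)
```

Against the real database the same loop could call tax.parent instead of the table lookup, at the cost of one query per level.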

Install SwissProt data

Use bioperl-db/scripts/biosql/load_seqdatabase.pl.

perl load_seqdatabase.pl -dbuser root -dbname biosql_sp_test -namespace swissprot -format swiss sprot40.dat

Accessing UniProt/SwissProt data using ARBioSQL

