Retrieve FASTA

Given an accession number, this program retrieves the matching nucleotide sequence from NCBI GenBank and prints it to standard output in FASTA format. The script was originally written for the AluHunter project and requires BioPerl. Download the script here and call it with:

perl retrieveFasta_CMB_1_0.pl {accession_number} > output.fasta

This command, for example, creates a FASTA file in the current working directory containing a Neandertal mitochondrial genome:

perl retrieveFasta_CMB_1_0.pl FM865408.1 > neandertal_mt.fasta

Here's the code in its entirety, for reference:

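#!/usr/bin/perl

use strict;
use warnings;

use Bio::DB::GenBank;
use Bio::DB::Query::GenBank;

# Accession number from the command line, e.g. FM865408.1
my $accession = shift
    or die "Usage: perl retrieveFasta_CMB_1_0.pl {accession_number} > output.fasta\n";

# Restrict the search to the accession field
my $query = $accession . "[ACCN]";

my $query_obj = Bio::DB::Query::GenBank->new(-db    => 'nucleotide',
                                             -query => $query);

my $gb_obj = Bio::DB::GenBank->new;

# Stream every record that matches the query
my $stream_obj = $gb_obj->get_Stream_by_query($query_obj);

while (my $seq_obj = $stream_obj->next_seq) {

	# Header line in the form: >gi|{primary_id}|gb|{accession}| {description}
	print ">gi|";
	print $seq_obj->primary_id;
	print "|gb|";
	print $accession . "| ";
	print $seq_obj->desc;
	print "\n";

	# The whole sequence on one line, followed by a blank line
	print $seq_obj->seq;
	print "\n\n";

}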
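
The script above prints each sequence on a single line. Many FASTA files instead wrap the sequence at a fixed width. Here is a minimal sketch of one way to do that, assuming a hypothetical helper named wrap_sequence and a default width of 70 characters (neither is part of the original script):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): break a sequence
# string into lines of at most $width characters, joined by newlines.
sub wrap_sequence {
    my ($seq, $width) = @_;
    $width ||= 70;    # assumed default width, chosen for illustration

    # "(A$width)*" splits the string into consecutive chunks of $width
    # characters; the last chunk may be shorter.
    my @lines = unpack("(A$width)*", $seq);
    return join("\n", @lines);
}

# Small demonstration with a made-up sequence and a width of 4
print wrap_sequence("ACGTACGTAC", 4), "\n";
```

To try this in the script, the line that prints $seq_obj->seq could be changed to print wrap_sequence($seq_obj->seq, 70) instead.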