BLAST Module
Fast protein sequence similarity search using Diamond BLAST

Module Overview
The BLAST module runs protein sequence similarity searches with Diamond BLAST. Use it for sequence comparison, homology detection, and functional annotation.
Key Features
1. Select the Diamond BLAST program type and database category for optimal sequence comparison
2. Input query sequences through paste or file upload methods with specific sequence requirements
3. Configure advanced parameters for more precise and detailed sequence alignment analysis
 
Program and Database Selection
Diamond Program Types
Choose the appropriate Diamond BLAST program based on your query sequence type and analysis goals.
- Diamond BLASTP: Protein query vs protein database
 - Diamond BLASTX: Nucleotide query translated vs protein database
 
Database Categories
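Select the database type that matches your analysis. Both Diamond programs listed above search against protein sequences, and both usage examples in this guide use the Protein Database.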
- Nucleotide Database: DNA/RNA sequences for nucleotide comparisons
 - Protein Database: Amino acid sequences for protein comparisons
 
Sequence Input Methods
Input Options
Choose between pasting sequences directly or uploading sequence files for analysis.
Paste Sequence
- Direct text input in FASTA format
- Supports both single and multiple sequences
- Real-time sequence validation
- Immediate format checking
 
Upload File
- Support for .fasta, .fa, .txt files
- Drag and drop interface
- Automatic file parsing
- Batch sequence processing
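The parsing of multi-sequence FASTA input mentioned above can be pictured with a short sketch. The module's own parser is not documented here, so the function name `parse_fasta` and its behavior (joining wrapped lines, skipping blank lines) are assumptions made for illustration only.

```python
def parse_fasta(text):
    """Split FASTA-formatted text into (header, sequence) pairs."""
    records = []
    header = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Made-up example input with two records; the first is wrapped over two lines.
example = """>seq1 hypothetical protein
MKTAYIAKQR
QISFVKSHFS
>seq2
MSDNELLK
"""

for name, sequence in parse_fasta(example):
    print(name, len(sequence))
```

Each record comes back as a header and a single unwrapped sequence string, which is a convenient shape for checking every sequence before submission.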
 
Sequence Requirements
Allowed Characters:
- Amino acids: A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y
- Nucleotides: A, T, C, G, U, N
- Ambiguity codes: B, Z, X
- Gaps: Hyphens (-) for sequence gaps
 
Length Requirements:
- Minimum: 3 characters
- Maximum: 100,000 characters
- Format: FASTA format recommended
 
Invalid Characters:
- Asterisks (*), dots (.), numbers, or special symbols
- Spaces within sequence data
- Non-standard amino acid codes
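As a rough pre-submission check, the character and length rules above can be expressed in a few lines of Python. This is a sketch, not the module's actual validator: the function name `validate_sequence` is hypothetical, and two details are assumptions because the requirements do not state them (lowercase letters are treated like uppercase, and the length limits apply to each sequence separately).

```python
# Character sets copied from the "Allowed Characters" list.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
NUCLEOTIDES = set("ATCGUN")
AMBIGUITY_CODES = set("BZX")
GAP = {"-"}
ALLOWED = AMINO_ACIDS | NUCLEOTIDES | AMBIGUITY_CODES | GAP

# Limits copied from the "Length Requirements" list.
MIN_LENGTH = 3
MAX_LENGTH = 100_000


def validate_sequence(sequence):
    """Return a list of problems; an empty list means the sequence passes."""
    problems = []
    if len(sequence) < MIN_LENGTH:
        problems.append(f"too short: {len(sequence)} < {MIN_LENGTH}")
    if len(sequence) > MAX_LENGTH:
        problems.append(f"too long: {len(sequence)} > {MAX_LENGTH}")
    # Assumption: case-insensitive matching against the allowed set.
    invalid = sorted({ch for ch in sequence if ch.upper() not in ALLOWED})
    if invalid:
        problems.append("invalid characters: " + " ".join(repr(ch) for ch in invalid))
    return problems


print(validate_sequence("MKTAYIAKQR"))  # passes, so the list is empty
print(validate_sequence("MKT*A Y1"))    # reports the asterisk, space, and digit
```

Returning a list of problems rather than a single boolean makes it easy to show all issues at once.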
 
Advanced Parameters
Parameter Configuration
Fine-tune your BLAST search with advanced parameters for more precise and detailed analysis.
E-value Threshold
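The E-value is the number of hits with an equal or better score that you would expect to find by chance in a database of the same size. Lower thresholds make the search more stringent and return fewer, higher-confidence hits.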
Common Values:
- 0.001: Very strict (high confidence)
- 0.01: Strict (default for most searches)
- 0.1: Moderate
- 1.0: Relaxed
- 10.0: Very relaxed
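To get a feel for these thresholds, the sketch below uses the textbook BLAST relationship between bit score and E-value, E = m × n × 2^(−S'), where m is the query length, n is the total database length, and S' is the bit score. Diamond's exact computation may differ (for example through effective-length corrections), and the lengths used here are hypothetical values chosen only for illustration.

```python
def expected_chance_hits(bit_score, query_length, database_length):
    """Textbook E-value estimate: E = m * n * 2**(-S')."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical sizes: a 300-residue query against a 50-million-residue database.
for bit_score in (30, 40, 50):
    e_value = expected_chance_hits(bit_score, query_length=300, database_length=50_000_000)
    print(f"bit score {bit_score}: E = {e_value:.2e}")
```

Because E scales linearly with database length, the same alignment gets a larger E-value in a larger database, which is why database size matters when choosing a threshold.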
 
Other Parameters:
- Matrix: Optimized for Diamond
- Gap penalties: Default values
- Max hits: Configurable
- Word size: Auto-optimized
 
Search Tips for Optimal Results
Sequence Preparation
- Clean sequences by removing non-standard characters
 - Use FASTA format for best compatibility
 - Check sequence length before submission
 - Verify sequence type matches program selection
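One way to check that the sequence type matches the program is a simple composition heuristic, sketched below. This is an assumption-laden illustration, not how the module decides: the function names and the 0.9 threshold are hypothetical, and short or unusual protein sequences made up mostly of A, C, G, T, and N would be misclassified.

```python
NUCLEOTIDE_LETTERS = set("ACGTUN")


def guess_sequence_type(sequence, threshold=0.9):
    """Guess 'nucleotide' if at least `threshold` of the letters are A/C/G/T/U/N."""
    letters = [ch for ch in sequence.upper() if ch.isalpha()]
    if not letters:
        raise ValueError("sequence contains no letters")
    nucleotide_fraction = sum(ch in NUCLEOTIDE_LETTERS for ch in letters) / len(letters)
    return "nucleotide" if nucleotide_fraction >= threshold else "protein"


def suggest_program(sequence):
    """Map the guessed type to the program names used in this guide."""
    if guess_sequence_type(sequence) == "nucleotide":
        return "Diamond BLASTX"
    return "Diamond BLASTP"


print(suggest_program("ATGGCCATTGTAATGGGCCGC"))
print(suggest_program("MKTAYIAKQRQISFVKSHFSRQ"))
```

Treat the result as a prompt to double-check the selection rather than as a definitive classification.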
 
Parameter Optimization
- Start with default E-value (0.01) for most searches
 - Use stricter E-values for high-confidence matches
 - Adjust parameters based on sequence length
 - Consider database size when setting thresholds
 
Usage Examples
Example 1: Protein Homology Search
- Select “Diamond BLASTP” program for protein-protein comparison
 - Choose “Protein Database” as the target database
 - Paste your protein sequence in FASTA format
 - Set E-value to 0.001 for high-confidence matches
 - Click “Run Diamond Search” to start analysis
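For readers who also use the standalone `diamond` command-line tool, the steps above correspond roughly to the command assembled in this sketch. How the web module invokes Diamond internally is not documented here, so treat this as an equivalent under assumptions: the helper name `build_diamond_command` and all file names are hypothetical, and the database is assumed to have been built already with `diamond makedb`.

```python
def build_diamond_command(program, database, query_file, output_file, evalue):
    """Assemble a diamond command line as an argument list (nothing is executed)."""
    if program not in ("blastp", "blastx"):
        raise ValueError(f"unsupported program: {program!r}")
    return [
        "diamond", program,
        "--db", database,         # Diamond-formatted protein database
        "--query", query_file,    # FASTA file with the query sequences
        "--out", output_file,     # where the hits are written
        "--evalue", str(evalue),  # E-value threshold
        "--outfmt", "6",          # BLAST-style tabular (TSV) output
    ]


# Hypothetical file names, with the strict E-value from the example above.
command = build_diamond_command("blastp", "proteins.dmnd", "query.fasta", "hits.tsv", 0.001)
print(" ".join(command))
# To actually run it: subprocess.run(command, check=True)
```

Passing `"blastx"` instead of `"blastp"` gives the matching command for a translated nucleotide query.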
 
Example 2: Nucleotide Translation Search
- Select “Diamond BLASTX” for nucleotide-to-protein search
 - Choose “Protein Database” as the target database
 - Upload a FASTA file containing nucleotide sequences
 - Use default E-value (0.01) for balanced results
 - Download results in multiple formats for analysis
 
Available Data Types
- BLAST Results: Hit alignments and scores
- TSV Format: Raw Diamond output
- JSON Format: Structured results
- FASTA Summary: Query and hit sequences
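The TSV download can be post-processed with a few lines of Python. The column layout of this module's TSV is not stated here, so the sketch assumes the standard 12-column BLAST tabular format that Diamond writes by default; the function name `read_hits` is hypothetical and the sample rows are made-up values for illustration, not real results.

```python
import csv
import io

# Assumed column order: the standard BLAST tabular (outfmt 6) fields.
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def read_hits(handle, max_evalue=0.01):
    """Parse tab-separated hits and keep those at or below max_evalue."""
    hits = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        if hit["evalue"] <= max_evalue:
            hits.append(hit)
    return hits


# Made-up sample rows: one strong hit and one weak hit.
sample = (
    "query1\tsubjA\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-50\t380\n"
    "query1\tsubjB\t35.0\t80\t50\t2\t10\t89\t40\t119\t0.5\t32.1\n"
)

for hit in read_hits(io.StringIO(sample)):
    print(hit["qseqid"], hit["sseqid"], hit["evalue"], hit["bitscore"])
```

For a downloaded file, pass an open file handle instead of the `io.StringIO` wrapper.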