BLAST Module
Fast protein sequence similarity search using Diamond BLAST

Module Overview
The BLAST module runs fast protein sequence similarity searches with Diamond BLAST. It is the entry point for sequence comparison, homology detection, and functional annotation.
Key Features
- Select the Diamond BLAST program type and database category for your sequence comparison
- Input query sequences by pasting them or uploading a file, subject to the sequence requirements below
- Configure advanced parameters for more precise and detailed sequence alignment analysis
Program and Database Selection
Diamond Program Types
Choose the appropriate Diamond BLAST program based on your query sequence type and analysis goals.
- Diamond BLASTP: Protein query vs protein database
- Diamond BLASTX: Nucleotide query, translated in six reading frames, vs protein database
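The choice between the two programs depends on whether the query is protein or nucleotide. The following is a minimal sketch of one way to guess this from residue composition. The function name `suggest_program` and the 0.9 threshold are hypothetical and not part of the module; short or unusual sequences can be misclassified, so the result is only a hint.

```python
NUCLEOTIDE_LETTERS = set("ACGTUN")


def suggest_program(sequence: str, threshold: float = 0.9) -> str:
    """Suggest 'blastx' for nucleotide-like input, otherwise 'blastp'.

    Heuristic only: the threshold is an assumption, not a module setting.
    """
    residues = [c for c in sequence.upper() if c.isalpha()]
    if not residues:
        raise ValueError("sequence contains no letters")
    # Fraction of letters that could be nucleotide codes.
    nucleotide_fraction = sum(c in NUCLEOTIDE_LETTERS for c in residues) / len(residues)
    return "blastx" if nucleotide_fraction >= threshold else "blastp"


if __name__ == "__main__":
    print(suggest_program("ATGGCGTACGTTAGC"))
    print(suggest_program("MKTAYIAKQRQISFVKSHFSRQ"))
```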
Database Categories
Select the appropriate database type for your sequence comparison analysis.
- Nucleotide Database: DNA/RNA sequences for nucleotide comparisons
- Protein Database: Amino acid sequences for protein comparisons
Sequence Input Methods
Input Options
Choose between pasting sequences directly or uploading sequence files for analysis.
Paste Sequence
- Direct text input in FASTA format
- Supports both single and multiple sequences
- Real-time sequence validation
- Immediate format checking
Upload File
- Supports .fasta, .fa, and .txt files
- Drag and drop interface
- Automatic file parsing
- Batch sequence processing
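Both input methods end with FASTA text being split into individual records. The sketch below shows one simple way to do that, so you can check a file locally before submitting it. The function `parse_fasta` is hypothetical and is not the module's own parser; it assumes a bare sequence with no header line should be treated as a single unnamed record.

```python
from typing import Iterator, List, Optional, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        else:
            if header is None:
                # Assumption: bare sequence without a header is one unnamed record.
                header = ""
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


if __name__ == "__main__":
    sample = ">seq1\nMKTAYI\nAKQRQ\n>seq2 example\nACGTACGT\n"
    for name, seq in parse_fasta(sample):
        print(name, len(seq))
```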
Sequence Requirements
Allowed Characters:
- Amino acids: A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y
- Nucleotides: A, T, C, G, U, N
- Ambiguity codes: B, Z, X
- Gaps: Hyphens (-) for sequence gaps
Length Requirements:
- Minimum: 3 characters
- Maximum: 100,000 characters
- Format: FASTA recommended
Invalid Characters:
- Asterisks (*), dots (.), numbers, or special symbols
- Spaces within sequence data
- Non-standard amino acid codes
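The requirements above can be checked locally before submission. This is a minimal sketch, not the module's validator: the name `validate_sequence` is hypothetical, and it assumes that lowercase letters are accepted and that the length limits count every character of the sequence, including gaps.

```python
from typing import List

# Allowed characters as listed above: amino acids, nucleotides,
# ambiguity codes, and the gap hyphen.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY") | set("ATCGUN") | set("BZX") | {"-"}
MIN_LENGTH = 3
MAX_LENGTH = 100_000


def validate_sequence(sequence: str) -> List[str]:
    """Return a list of problems; an empty list means the sequence passes."""
    problems: List[str] = []
    if not MIN_LENGTH <= len(sequence) <= MAX_LENGTH:
        problems.append(f"length {len(sequence)} outside {MIN_LENGTH}-{MAX_LENGTH}")
    # Assumption: case-insensitive check.
    invalid = sorted({c for c in sequence.upper() if c not in ALLOWED})
    if invalid:
        problems.append("invalid characters: " + " ".join(repr(c) for c in invalid))
    return problems


if __name__ == "__main__":
    for candidate in ["MKTAYIAKQR", "MK", "MKT*AY 1"]:
        print(repr(candidate), validate_sequence(candidate))
```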
Advanced Parameters
Parameter Configuration
Fine-tune your BLAST search with advanced parameters for more precise and detailed analysis.
E-value Threshold
The E-value is the expected number of hits with an equal or better score that would occur by chance in a database of the same size. Lower thresholds make the search more stringent.
Common Values:
- 0.001: Very strict (high confidence)
- 0.01: Strict (default for most searches)
- 0.1: Moderate
- 1.0: Relaxed
- 10.0: Very relaxed
Other Parameters:
- Matrix: Optimized for Diamond
- Gap penalties: Default values
- Max hits: Configurable
- Word size: Auto-optimized
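For readers who also run Diamond on the command line, the E-value and max-hits settings correspond to the `--evalue` and `--max-target-seqs` options of the `diamond` executable. The sketch below only assembles an argument list and does not run anything. How this module invokes Diamond internally is not documented here, so the helper `build_diamond_command` and the file names in the usage example are hypothetical.

```python
from typing import List, Optional


def build_diamond_command(
    program: str,
    query_path: str,
    db_path: str,
    out_path: str,
    evalue: float = 0.01,
    max_hits: Optional[int] = None,
) -> List[str]:
    """Assemble a diamond command line as a list of arguments."""
    if program not in ("blastp", "blastx"):
        raise ValueError("program must be 'blastp' or 'blastx'")
    command = [
        "diamond", program,
        "--query", query_path,
        "--db", db_path,
        "--out", out_path,
        "--evalue", str(evalue),
        "--outfmt", "6",  # tabular output
    ]
    if max_hits is not None:
        command += ["--max-target-seqs", str(max_hits)]
    return command


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_diamond_command(
        "blastp", "query.fasta", "proteins.dmnd", "hits.tsv",
        evalue=0.001, max_hits=25)))
```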
Search Tips for Optimal Results
Sequence Preparation
- Clean sequences by removing non-standard characters
- Use FASTA format for best compatibility
- Check sequence length before submission
- Verify sequence type matches program selection
Parameter Optimization
- Start with default E-value (0.01) for most searches
- Use stricter E-values for high-confidence matches
- Adjust parameters based on sequence length
- Consider database size when setting thresholds
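The last tip follows from the standard BLAST statistics relation E = m * n * 2^(-S), where m is the query length, n is the total database length, and S is the bit score. The sketch below uses that relation to show how the same alignment becomes less significant as the database grows. It is illustrative only: the E-values Diamond reports may include additional corrections, and the function name `expected_hits` is hypothetical.

```python
def expected_hits(bit_score: float, query_length: int, database_length: int) -> float:
    """Approximate E-value from a bit score: E = m * n * 2**(-S)."""
    return query_length * database_length * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # Same bit score and query, two database sizes (in residues).
    small = expected_hits(50.0, 300, 10**8)
    large = expected_hits(50.0, 300, 10**9)
    print(f"small database: {small:.3g}")
    print(f"large database: {large:.3g}")
```

Because E grows linearly with database length, a hit that passes a threshold against a small database can fail the same threshold against a database ten times larger.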
Usage Examples
Example 1: Protein Homology Search
- Select “Diamond BLASTP” program for protein-protein comparison
- Choose “Protein Database” as the target database
- Paste your protein sequence in FASTA format
- Set E-value to 0.001 for high-confidence matches
- Click “Run Diamond Search” to start analysis
Example 2: Nucleotide Translation Search
- Select “Diamond BLASTX” for nucleotide-to-protein search
- Choose “Protein Database” as the target database
- Upload a FASTA file containing nucleotide sequences
- Use default E-value (0.01) for balanced results
- Download results in multiple formats for analysis
Available Data Types
- BLAST Results: Hit alignments and scores
- TSV Format: Raw Diamond output
- JSON Format: Structured results
- FASTA Summary: Query and hit sequences
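A downloaded TSV file can be post-processed with a few lines of code. The sketch below assumes the file uses Diamond's default 12-column tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore); the columns this module actually exports may differ, so check the file first. The function name `read_hits` is hypothetical.

```python
import csv
import io
from typing import Dict, List

# Assumption: default Diamond tabular column order.
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def read_hits(tsv_text: str, max_evalue: float = 0.01) -> List[Dict[str, object]]:
    """Parse tabular hits and keep those at or below max_evalue, best first."""
    hits: List[Dict[str, object]] = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) != len(COLUMNS):
            continue  # skip blank or unexpected lines
        hit: Dict[str, object] = dict(zip(COLUMNS, row))
        try:
            hit["evalue"] = float(row[10])
            hit["bitscore"] = float(row[11])
        except ValueError:
            continue  # skip a header row or malformed numbers
        if hit["evalue"] <= max_evalue:
            hits.append(hit)
    return sorted(hits, key=lambda h: h["evalue"])


if __name__ == "__main__":
    sample = (
        "q1\ts1\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-50\t350\n"
        "q1\ts2\t35.0\t80\t50\t2\t10\t90\t5\t85\t0.5\t30.1\n"
    )
    for hit in read_hits(sample):
        print(hit["qseqid"], hit["sseqid"], hit["evalue"], hit["bitscore"])
```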