BLAST Module

Fast protein sequence similarity search using Diamond BLAST

Module Overview

The BLAST module runs fast, accurate protein sequence similarity searches using Diamond BLAST. Use it for sequence comparison, homology detection, and functional annotation.

Key Features

1. Select Diamond BLAST program type and database category for optimal sequence comparison
2. Input query sequences through paste or file upload methods with specific sequence requirements
3. Configure advanced parameters for more precise and detailed sequence alignment analysis

Program and Database Selection

Diamond Program Types

Choose the appropriate Diamond BLAST program based on your query sequence type and analysis goals.

Diamond BLASTP: Protein query vs protein database
Diamond BLASTX: Nucleotide query translated vs protein database

Database Categories

Select the appropriate database type for your sequence comparison analysis.

Nucleotide Database: DNA/RNA sequences for nucleotide comparisons
Protein Database: Amino acid sequences for protein comparisons

Sequence Input Methods

Input Options

Choose between pasting sequences directly or uploading sequence files for analysis.

Paste Sequence

• Direct text input in FASTA format
• Supports both single and multiple sequences
• Real-time sequence validation
• Immediate format checking

Upload File

• Support for .fasta, .fa, .txt files
• Drag and drop interface
• Automatic file parsing
• Batch sequence processing
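
The module's own parser is not documented here, so the following is only a minimal sketch of how FASTA text with one or more sequences can be split into records. The function name `parse_fasta` and the placeholder header `query` for input without a header line are assumptions made for illustration.

```python
from typing import Iterator, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        else:
            if header is None:
                # Assumption: bare sequence without a header gets a placeholder name.
                header = "query"
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up example input with two records.
records = list(parse_fasta(">seq1 example\nMKTAYIAK\nQRQISFVK\n>seq2\nMSTNPKPQ\n"))
```

Sequence lines that belong to the same record are joined, so a sequence wrapped over several lines comes back as one string.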

Sequence Requirements

Allowed Characters:

• Amino acids: A, C, D, E, F, G, H, I, K, L, M, N, P, Q, R, S, T, V, W, Y
• Nucleotides: A, T, C, G, U, N
• Ambiguity codes: B, Z, X
• Gaps: Hyphens (-) for sequence gaps

Length Requirements:

• Minimum: 3 characters
• Maximum: 100,000 characters
• Format: FASTA format recommended

Invalid Characters:

• Asterisks (*), dots (.), numbers, or special symbols
• Spaces within sequence data
• Non-standard amino acid codes
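
To check a sequence against these rules before submitting, something like the sketch below may help. It combines the allowed characters and length limits listed above. The function name `validate_sequence`, the upper-casing of input, and counting hyphens toward the length are assumptions, and the module's real validation may differ.

```python
# Allowed characters as listed above: amino acids, nucleotides, ambiguity codes, gaps.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWY") | set("ATCGUN") | set("BZX") | {"-"}
MIN_LEN = 3
MAX_LEN = 100_000


def validate_sequence(seq: str) -> list:
    """Return a list of problems; an empty list means the sequence passes."""
    problems = []
    # Assumption: every character, including gaps, counts toward the length.
    if not (MIN_LEN <= len(seq) <= MAX_LEN):
        problems.append(f"length {len(seq)} outside {MIN_LEN}-{MAX_LEN}")
    # Assumption: lower-case letters are accepted and upper-cased before checking.
    bad = sorted(set(seq.upper()) - ALLOWED)
    if bad:
        problems.append("invalid characters: " + " ".join(repr(c) for c in bad))
    return problems


ok = validate_sequence("MKTAYIAKQRQISFVK")   # passes
too_short = validate_sequence("MK")           # fails the minimum length
dirty = validate_sequence("MKT*AY 1")         # asterisk, space, and digit are rejected
```

Pass the sequence data only, without the FASTA header line, since headers routinely contain spaces and digits.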

Advanced Parameters

Parameter Configuration

Fine-tune your BLAST search with advanced parameters for more precise and detailed analysis.

E-value Threshold

The E-value is the number of hits with an equal or better score that you would expect to see by chance in a database of the same size. Lower thresholds make the search more stringent.

Common Values:

• 0.001: Very strict (high confidence)
• 0.01: Strict (default for most searches)
• 0.1: Moderate
• 1.0: Relaxed
• 10.0: Very relaxed

Other Parameters:

• Matrix: Optimized for Diamond
• Gap penalties: Default values
• Max hits: Configurable
• Word size: Auto-optimized

Search Tips for Optimal Results

Sequence Preparation

• Clean sequences by removing non-standard characters
• Use FASTA format for best compatibility
• Check sequence length before submission
• Verify sequence type matches program selection
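
One rough way to verify that the sequence type matches the program is to look at letter composition. The sketch below is a heuristic, not part of the module: the function names and the 0.9 threshold are assumptions, and short or unusual protein sequences made mostly of A, C, G, T, or N can be misclassified.

```python
def guess_sequence_type(seq: str, threshold: float = 0.9) -> str:
    """Guess 'nucleotide' or 'protein' from letter composition (heuristic)."""
    letters = [c for c in seq.upper() if c.isalpha()]
    if not letters:
        raise ValueError("sequence contains no letters")
    # Count letters that are also valid nucleotide codes.
    nucleotide_like = sum(c in "ACGTUN" for c in letters)
    return "nucleotide" if nucleotide_like / len(letters) >= threshold else "protein"


def suggest_program(seq: str) -> str:
    """Map the guessed type to the program names used on this page."""
    if guess_sequence_type(seq) == "nucleotide":
        return "Diamond BLASTX"
    return "Diamond BLASTP"


dna_choice = suggest_program("ATGGCCATTGTAATGGGCCGC")
protein_choice = suggest_program("MKTAYIAKQRQISFVKSHFSRQ")
```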

Parameter Optimization

• Start with default E-value (0.01) for most searches
• Use stricter E-values for high-confidence matches
• Adjust parameters based on sequence length
• Consider database size when setting thresholds
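
Database size matters because the E-value grows with the size of the search space. The sketch below uses the standard BLAST-style approximation E = m × n × 2^(-S'), where m is the query length, n is the total number of residues in the database, and S' is the bit score. Diamond's exact length corrections are not reproduced here, and all numbers are hypothetical.

```python
def expected_chance_hits(bit_score: float, query_len: int, db_residues: int) -> float:
    """Approximate E-value from a bit score: E = m * n * 2**(-S')."""
    return query_len * db_residues * 2.0 ** (-bit_score)


# The same hypothetical hit (bit score 50, 300-residue query) in two database sizes.
small_db = expected_chance_hits(50, 300, 10**8)
large_db = expected_chance_hits(50, 300, 10**10)
print(f"small database: E = {small_db:.2e}")
print(f"large database: E = {large_db:.2e}")
```

In this example the database is 100 times larger, so the E-value is 100 times larger. The same alignment passes a 0.001 threshold in the smaller database and fails it in the larger one.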

Usage Examples

Example 1: Protein Homology Search

1. Select “Diamond BLASTP” program for protein-protein comparison
2. Choose “Protein Database” as the target database
3. Paste your protein sequence in FASTA format
4. Set E-value to 0.001 for high-confidence matches
5. Click “Run Diamond Search” to start analysis

Example 2: Nucleotide Translation Search

1. Select “Diamond BLASTX” for nucleotide-to-protein search
2. Choose “Protein Database” as the target database
3. Upload a FASTA file containing nucleotide sequences
4. Use default E-value (0.01) for balanced results
5. Download results in multiple formats for analysis
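
For readers who also run Diamond locally, the two examples above correspond roughly to the command lines built by this sketch. How the web module invokes Diamond is not documented here, so treat this as an approximation: the file names and the `max_hits` value of 25 are hypothetical, while the options shown (`--query`, `--db`, `--out`, `--evalue`, `--max-target-seqs`, `--outfmt 6`) are standard Diamond command-line options.

```python
import shlex


def build_diamond_command(program, query_path, db_path, out_path,
                          evalue=0.01, max_hits=25):
    """Build an argument list for a local Diamond run (the command is not executed)."""
    if program not in ("blastp", "blastx"):
        raise ValueError("program must be 'blastp' or 'blastx'")
    return [
        "diamond", program,
        "--query", query_path,               # FASTA file with the query sequences
        "--db", db_path,                     # Diamond-formatted protein database
        "--out", out_path,                   # where the hits are written
        "--evalue", str(evalue),             # E-value threshold
        "--max-target-seqs", str(max_hits),  # maximum hits reported per query
        "--outfmt", "6",                     # tabular (TSV) output
    ]


# Example 1: protein query, strict E-value (hypothetical file names).
example1 = build_diamond_command("blastp", "query.fasta", "proteins.dmnd",
                                 "hits.tsv", evalue=0.001)
# Example 2: nucleotide query translated against the protein database.
example2 = build_diamond_command("blastx", "reads.fasta", "proteins.dmnd", "hits.tsv")
print(shlex.join(example1))
```

Returning a list rather than a single string keeps each argument separate, so the result can be passed to `subprocess.run` without shell quoting problems.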

Available Data Types

BLAST Results: Hit alignments and scores
TSV Format: Raw Diamond output
JSON Format: Structured results
FASTA Summary: Query and hit sequences
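
If you download the TSV format, a short script can turn it into structured records for filtering. The sketch below assumes the file uses Diamond's default 12-column tabular layout; the columns in the module's download may differ, so check them first. The sample line is made up for illustration.

```python
import csv
import io
import json

# Assumption: default tabular columns, in this order.
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]
INT_FIELDS = {"length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send"}
FLOAT_FIELDS = {"pident", "evalue", "bitscore"}


def parse_hits(tsv_text):
    """Parse tab-separated hit lines into a list of dictionaries."""
    hits = []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        for key in INT_FIELDS:
            hit[key] = int(hit[key])
        for key in FLOAT_FIELDS:
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits


# Made-up sample line, not real search output.
sample = "query1\tsp|P12345|EXAMPLE\t98.5\t200\t3\t0\t1\t200\t5\t204\t1.2e-100\t350.0\n"
hits = parse_hits(sample)
confident = [h for h in hits if h["evalue"] <= 0.001]
print(json.dumps(confident, indent=2))
```

A row with fewer than 12 columns raises a `KeyError` in this sketch, which is a quick signal that the file does not follow the assumed layout.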
