Removes outliers from the results dataframe. An outlier here is a pair with high percent identity but a low total aligned portion of the genome. Rows are selected using the specified thresholds for percent identity and alignment length.
Usage
remove_outliers(
  df,
  percent_identity_col,
  alignment_length_col,
  percent_identity_threshold = NULL,
  alignment_length_threshold = NULL
)
Arguments
- df
A dataframe containing alignment results data.
- percent_identity_col
The column name for the percent identity values.
- alignment_length_col
The column name for the normalized alignment length values.
- percent_identity_threshold
The threshold for percent identity (optional; defaults to NULL).
- alignment_length_threshold
The threshold for alignment length, applied to the values in alignment_length_col (optional; defaults to NULL).
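Examples
The filtering described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper name remove_outliers_sketch, the example data, and the column names are hypothetical. The comparison directions (identity at or above its threshold, alignment length at or below its threshold) and the behavior when a threshold is NULL (nothing is removed) are assumptions, since this page does not specify them.

```r
# Hypothetical stand-in for illustration only; NOT the package's remove_outliers().
remove_outliers_sketch <- function(df,
                                   percent_identity_col,
                                   alignment_length_col,
                                   percent_identity_threshold = NULL,
                                   alignment_length_threshold = NULL) {
  # Assumption: if either threshold is missing, no rows are removed.
  if (is.null(percent_identity_threshold) ||
      is.null(alignment_length_threshold)) {
    return(df)
  }

  # Assumption: "high identity" means >= threshold,
  # "low aligned genome" means <= threshold.
  high_identity <- df[[percent_identity_col]] >= percent_identity_threshold
  low_alignment <- df[[alignment_length_col]] <= alignment_length_threshold
  is_outlier <- high_identity & low_alignment

  # Rows with missing values are kept rather than treated as outliers.
  is_outlier[is.na(is_outlier)] <- FALSE

  df[!is_outlier, , drop = FALSE]
}

# Made-up example data: three genome pairs.
results <- data.frame(
  pair = c("A-B", "A-C", "B-C"),
  identity = c(99.1, 98.7, 82.4),
  aligned_fraction = c(0.92, 0.08, 0.75),
  stringsAsFactors = FALSE
)

filtered <- remove_outliers_sketch(
  results,
  percent_identity_col = "identity",
  alignment_length_col = "aligned_fraction",
  percent_identity_threshold = 95,
  alignment_length_threshold = 0.25
)
print(filtered)
```

With these made-up thresholds, only the pair A-C combines high identity with a low aligned fraction, so it is the only row dropped. Column names are passed as strings and looked up with `[[`, which matches the way the Arguments section describes percent_identity_col and alignment_length_col as column names.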