public class Editions extends Object
Modifier and Type | Class and Description |
---|---|
static class | Editions.Chunk |
Modifier and Type | Field and Description |
---|---|
protected boolean | closed |
static int | DELETION |
protected double | delta |
double | distance<br>Levenshtein's distance between vs1 and vs2. |
protected int[][] | editions<br>In the form [length][3] |
static int | INSERTION |
static int | MUTATION |
protected VectorString | vs1 |
protected VectorString | vs2 |
protected double | WD<br>Weight for deletion cost. |
protected double | WI<br>Weight for insertion cost. |
protected double | WM<br>Weight for correspondence cost. |
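As a rough illustration of how three such weights could enter an edit distance, the sketch below computes a weighted Levenshtein distance over plain Java strings. It is not the Editions implementation: the class name `WeightedEditDistance` and the flat per-symbol costs are assumptions made for the example, and the real class works on VectorString points with a cost function that this page does not describe.

```java
/** Hypothetical sketch: weighted Levenshtein distance over plain strings. */
final class WeightedEditDistance {

    /**
     * @param wi cost of one insertion (plays the role of WI)
     * @param wd cost of one deletion (plays the role of WD)
     * @param wm cost of replacing one symbol by a different one (plays the role of WM)
     */
    static double distance(String a, String b, double wi, double wd, double wm) {
        int n = a.length(), m = b.length();
        double[][] d = new double[n + 1][m + 1];
        for (int i = 1; i <= n; i++) d[i][0] = i * wd; // delete the whole prefix of a
        for (int j = 1; j <= m; j++) d[0][j] = j * wi; // insert the whole prefix of b
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                // Assumption: equal symbols cost nothing, unequal ones cost wm.
                double sub = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : wm;
                d[i][j] = Math.min(d[i - 1][j - 1] + sub,
                          Math.min(d[i - 1][j] + wd, d[i][j - 1] + wi));
            }
        }
        return d[n][m];
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting", 1, 1, 1)); // prints 3.0
    }
}
```

With all weights set to 1 this reduces to the classic Levenshtein distance. Raising one weight makes the corresponding edit less likely to appear on the cheapest path.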
Constructor and Description |
---|
Editions(VectorString vs1, VectorString vs2, double delta, boolean closed) |
Editions(VectorString vs1, VectorString vs2, double delta, boolean closed, double wi, double wd, double wm) |
Modifier and Type | Method and Description |
---|---|
Editions.Chunk | findLargestMutationChunk(int max_non_mut) |
double | getDistance() |
int[][] | getEditions() |
double | getPhysicalDistance(boolean skip_ends, int max_mut, float min_chunk, boolean average)<br>Returns the average distance between all point pairs involved in a mutation; if average is false, returns the cumulative distance instead. |
double | getSimilarity() |
double | getSimilarity(boolean skip_ends, int max_mut, float min_chunk)<br>A mutation is considered equal or nearly equal, and thus does not count. |
double | getSimilarity2() |
double | getSimilarity2(boolean skip_ends, int max_mut, float min_chunk)<br>Returns the number of mutations divided by max(len(vs1), len(vs2)); 1.0 means that all editions are mutations and both sequences have the same length. |
double[] | getStatistics(boolean skip_ends, int max_mut, float min_chunk, boolean score_mut_only)<br>Returns {average distance, cumulative distance, stdDev, median, prop_mut}, which are:<br>[0] - average distance: the average physical distance between mutation pairs<br>[1] - cumulative distance: the sum of the distances between mutation pairs<br>[2] - stdDev: the standard deviation of the physical distances between mutation pairs, relative to the average<br>[3] - median: the median physical distance between mutation pairs, more robust than the average to extreme values<br>[4] - prop_mut: the proportion of mutation pairs relative to the length of the queried sequence vs1. |
double | getStdDev(boolean skip_ends, int max_mut, float min_chunk) |
VectorString | getVS1() |
VectorString | getVS2() |
int | length() |
String | prettyPrint(String separator)<br>Gets the sequence of editions and matches in three lines, like:<br>vs1: 1 2 3 4 5 6 7 8 9<br>M M D M M M I I M M M<br>vs2: 1 2 3 4 5 6 7 8 9<br>With the given separator (defaults to tab if null) |
Editions | recreateFromCenter(int max_non_mut)<br>Finds the longest chunk of mutations (which may include runs of up to max_non_mut non-mutations), takes its center point, splits both vector strings there, performs matching towards the ends, and assembles a new Editions object. |
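The chunk search that recreateFromCenter relies on can be pictured with the following sketch. It is only one plausible reading of the description: the class `MutationChunks`, the flat `int[]` of operation types, the numeric values of the constants and the tie-breaking rule (first chunk wins) are all assumptions, not taken from the Editions source.

```java
/** Hypothetical sketch of a "longest mutation chunk" scan over edit operations. */
final class MutationChunks {
    // Assumed values, for illustration only; the real constants are not listed on this page.
    static final int DELETION = 1, INSERTION = 2, MUTATION = 3;

    /**
     * Returns {first, last} (inclusive indices into ops) of the longest chunk that
     * starts and ends with a mutation and contains no run of more than maxNonMut
     * consecutive non-mutations, or null if ops holds no mutation at all.
     */
    static int[] findLargest(int[] ops, int maxNonMut) {
        int[] best = null;
        int start = -1, last = -1, gap = 0;
        for (int i = 0; i <= ops.length; i++) {
            boolean end = (i == ops.length);
            if (!end && ops[i] == MUTATION) {
                if (start < 0) start = i; // open a new chunk
                last = i;                 // a chunk always ends on a mutation
                gap = 0;
            } else if (start >= 0 && (end || ++gap > maxNonMut)) {
                // Close the open chunk and keep it if it is strictly longer than the best so far.
                if (best == null || last - start > best[1] - best[0]) {
                    best = new int[] {start, last};
                }
                start = -1;
                gap = 0;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // M M D M M M I I M M M, the operation row from the prettyPrint example.
        int[] ops = {3, 3, 1, 3, 3, 3, 2, 2, 3, 3, 3};
        System.out.println(java.util.Arrays.toString(findLargest(ops, 0))); // prints [3, 5]
    }
}
```

Under these assumptions, the center point mentioned above would be `(first + last) / 2`, and each half of the edit sequence would then be matched again from that index towards its end.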
public static final int DELETION
public static final int INSERTION
public static final int MUTATION
protected final double WI
protected final double WD
protected final double WM
protected VectorString vs1
protected VectorString vs2
protected double delta
protected boolean closed
protected int[][] editions
public double distance
public Editions(VectorString vs1, VectorString vs2, double delta, boolean closed)
public Editions(VectorString vs1, VectorString vs2, double delta, boolean closed, double wi, double wd, double wm)
public double getDistance()
public int length()
public int[][] getEditions()
public VectorString getVS1()
public VectorString getVS2()
public double getSimilarity(boolean skip_ends, int max_mut, float min_chunk)
skip_ends - enables ignoring sequences at the beginning and end if they are insertions or deletions.
max_mut - indicates the maximum length of a contiguous sequence of mutations to be ignored when skipping insertions and deletions at the beginning and end.
min_chunk - indicates the minimal proportion of the string that should remain between the found start and end, for vs1. The function will return the regular similarity if the chunk is too small.
public double getSimilarity()
public double getSimilarity2()
public double getSimilarity2(boolean skip_ends, int max_mut, float min_chunk)
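The ratio that getSimilarity2 is documented to return (number of mutations over the length of the longer sequence) can be sketched as below. The sketch assumes that the first element of each `[3]` row of the editions array holds the operation type, which this page does not state, and it uses made-up constant values. The `skip_ends`, `max_mut` and `min_chunk` handling is left out.

```java
/** Hypothetical sketch of the mutations / max(len(vs1), len(vs2)) ratio. */
final class Similarity2Sketch {
    // Assumed values, for illustration only.
    static final int DELETION = 1, INSERTION = 2, MUTATION = 3;

    /**
     * @param editions rows of three ints; row[0] is ASSUMED to be the operation type
     * @param len1     number of points in the first sequence
     * @param len2     number of points in the second sequence
     */
    static double similarity2(int[][] editions, int len1, int len2) {
        int mutations = 0;
        for (int[] e : editions) {
            if (e[0] == MUTATION) mutations++;
        }
        int longest = Math.max(len1, len2);
        // Guard against two empty sequences to avoid dividing by zero.
        return longest == 0 ? 0 : (double) mutations / longest;
    }

    public static void main(String[] args) {
        int[][] editions = {{3, 0, 0}, {3, 1, 1}, {1, 2, 1}, {3, 3, 2}};
        System.out.println(similarity2(editions, 4, 3)); // prints 0.75
    }
}
```

The value reaches 1.0 only when every edit is a mutation, which in turn requires both sequences to have the same length.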
public double getPhysicalDistance(boolean skip_ends, int max_mut, float min_chunk, boolean average)
public double getStdDev(boolean skip_ends, int max_mut, float min_chunk)
public double[] getStatistics(boolean skip_ends, int max_mut, float min_chunk, boolean score_mut_only)
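To make the five entries of the getStatistics result concrete, here is a minimal sketch that derives them from a list of per-pair physical distances. It is an illustration under stated assumptions, not the library code: the class `MutationStats` is hypothetical, the distances are taken as given, the standard deviation uses the population form (dividing by n), and the handling of `skip_ends`, `max_mut`, `min_chunk` and `score_mut_only` is omitted.

```java
/** Hypothetical sketch of the five summary values over mutation-pair distances. */
final class MutationStats {

    /**
     * @param distances physical distance of each mutation pair
     * @param lengthVs1 number of points in the queried sequence vs1
     * @return {average, cumulative, stdDev, median, propMut}
     */
    static double[] statistics(double[] distances, int lengthVs1) {
        int n = distances.length;
        if (n == 0) return new double[] {0, 0, 0, 0, 0};

        double sum = 0;
        for (double d : distances) sum += d;
        double mean = sum / n;

        double sq = 0;
        for (double d : distances) sq += (d - mean) * (d - mean);
        double stdDev = Math.sqrt(sq / n); // assumption: population standard deviation

        double[] sorted = distances.clone(); // leave the caller's array untouched
        java.util.Arrays.sort(sorted);
        double median = (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;

        double propMut = lengthVs1 == 0 ? 0 : (double) n / lengthVs1;
        return new double[] {mean, sum, stdDev, median, propMut};
    }

    public static void main(String[] args) {
        double[] s = statistics(new double[] {1, 2, 3, 4, 10}, 10);
        System.out.println(java.util.Arrays.toString(s));
    }
}
```

For the distances 1, 2, 3, 4, 10 the average is 4 while the median is 3, which shows why the median is described as more robust to extreme values.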
public String prettyPrint(String separator)
public Editions.Chunk findLargestMutationChunk(int max_non_mut)
public Editions recreateFromCenter(int max_non_mut) throws Exception
Throws: Exception
Copyright © 2015–2021 Fiji. All rights reserved.