RANSAC
From Biocrawler, the free encyclopedia.
RANSAC is an abbreviation for "RANdom SAmple Consensus". It is an algorithm to estimate parameters in a mathematical model from a data set when the data set contains many outliers.
| Contents |
Overview
The input to the RANSAC algorithm is a set of data values, a model to judge the agreement between data, and some confidence parameters.
RANSAC achieves its goal by iteratively selecting a random subset of the original data values. For this set the model is "fit" to the value - that is, all free parameters of the model are reconstructed from the set. All other data values are tested against the fitted model - that is, for every data value of this test set, the algorithm determined how "well" it fits into the fitted model.
Algorithm
The generic RANSAC algorithm works as follows:
Given:
input - set of input data
model - model that can test and can be fit
n - minimum number of data values required to fit the model
k - the number of iterations required
t - threshold value for a positive single data element fit
d - the number of close data values required to assert a model fits well
Return:
bestfit - set of input data that fits the model best
iterations = 0
goodfits = []
while iterations < k
randomly select n data values from input, call that set "fitset"
fit model to "fitset" and call that model "fitmodel"
closepoints = []
for all data values outsideval in input that are not in "fitset"
if error of outsideval in regard to fitmodel is smaller than t
closepoints = closepoints + [outsideval]
if the number of elements in closepoints is equal to or larger than d
refit the fitmodel using all points in closepoints and fitset
goodfits = goodfits + [fitmodel]
return the best fitted model in goodfits as bestfit or none if there is none
While the parameter values of t and d has to be calculated from the individual requirements it can be experimentally determined. The interesting parameter of the RANSAC algorithm is k.
To calculate the parameter k given the known probability w of a good data value, the probability z of seeing only bad data values is used:
which leads to
To gain additional confidence, the standard deviation or multiples thereof can be added to k. The standard deviation of k is defined as
A common case is that w is not well known beforehand, but some rough value can be given. If n data values are given, the probability of success is wn.
Applications
The RANSAC algorithm is often used in Computer Vision problems to discard a small percentage of outliers from a sample data set to improve results using later algorithms. A simple example is line fitting to a set of 2-dimensional points.
References
- David A. Forsyth, Jean Ponce (2003). Computer Vision, a modern approach. Prentice Hall. ISBN 0130851981

