The problem with current spelling correction datasets is that they are not freely available, lack context, or are completely artificial. This motivates us to generate a new, freely available dataset that, although artificial, contains the most typical spelling errors. By identifying the most common misspelling types, we can generate a dataset that is not purely artificial but is also guided by how real misspellings are introduced into documents.
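To make the idea concrete, the following is a minimal sketch of how misspellings could be injected into clean text while keeping the surrounding context. It is an illustration under stated assumptions, not the procedure used for the dataset: the four error types (single-character insertion, deletion, substitution, and transposition), the function names `misspell` and `corrupt_sentence`, and the `error_rate` parameter are all hypothetical, and the error type is drawn uniformly rather than from observed frequencies.

```python
import random
import string

# Hypothetical error types (single-character edits). The text does not state
# which misspelling types are actually used; these are assumptions.
ERROR_TYPES = ("insert", "delete", "substitute", "transpose")


def misspell(word, error_type, rng):
    """Apply one single-character edit of the given type to `word`."""
    if len(word) < 2:
        return word  # too short to edit safely
    i = rng.randrange(len(word) - 1)  # position that always has a right neighbour
    letter = rng.choice(string.ascii_lowercase)
    if error_type == "insert":
        return word[:i] + letter + word[i:]
    if error_type == "delete":
        return word[:i] + word[i + 1:]
    if error_type == "substitute":
        # May pick the same letter, leaving the word unchanged.
        return word[:i] + letter + word[i + 1:]
    if error_type == "transpose":
        # Swap the characters at positions i and i + 1.
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    raise ValueError(f"unknown error type: {error_type}")


def corrupt_sentence(sentence, error_rate, rng):
    """Return (corrupted, original) so every error keeps its context."""
    noisy = [
        misspell(w, rng.choice(ERROR_TYPES), rng) if rng.random() < error_rate else w
        for w in sentence.split()
    ]
    return " ".join(noisy), sentence


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the generated data is reproducible
    noisy, clean = corrupt_sentence("the quick brown fox jumps", 0.5, rng)
    print(noisy, "|", clean)
```

Replacing the uniform choice of error type with weights estimated from real misspellings would bring such a generator closer to the stated goal of following how errors actually occur.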