Package ‘genomatic’
February 15, 2013
Type Package
Title Manages microsatellite projects. Creates 96-well maps, genotyping submission forms, rerun management, and import into statistical software.
Version 0.0-7
Date 2010-01-05
Author Brian J. Knaus
Maintainer Brian J. Knaus <[email protected]>
Depends R (>= 2.9.2), tcltk
Description Manages DNA fragment analysis projects. Creates 96-well maps, genotyping submission forms, rerun management, and import to statistical software.
License GPL-2
Repository CRAN
Date/Publication 2012-10-29 08:58:49
NeedsCompilation no
R topics documented:
abi_sub
allele_char
allele_hist
allele_process
bin_by_num
bin_caller
bin_init
bin_score
cat_sorter
condense
dup_checker
dup_replace
filename2sample
genobins
genogui
genomatic2genalex
genomatic2ntsys
genotyper2genomatic
init_gmtc
order_sample
plater
plate_write
pop2indiv
pop_get
sample_sorter
abi_sub Creates ABI submission forms
Description
Takes a 96-well plate and converts it into a submission form for ABI genotyping.
Usage
abi_sub(plate, fset, outfile)
Arguments
plate A 96-well plate of samples
fset filter set to be used
outfile Name of the outfile (ABI submission form) to be created
Details
This function takes a 96-well plate and converts it into an ABI format submission file. The well location is indicated. A sample name is created by concatenating the well and the sample name. The color number is indicated. The standard dye, dye set, and color info are set based on the color info input by the user. A color comment is created by concatenating the sample name (with well number), the color info, and the information provided in ‘loci_cats’. This is output to a comma delimited file which can be directly used by the ABI software or it can be opened with a spreadsheet and saved in a proprietary format.
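The well and sample name concatenation described above can be sketched in base R. This is only an illustration, not the package's code: the helper name `sketch_well_names`, the 8 x 12 matrix layout, and the "_" separator are assumptions, and the dye and color columns of a real ABI form are left out.

```r
# Hypothetical helper (not part of genomatic): label every well of a plate.
# Assumes `plate` is an 8 x 12 character matrix (rows A-H, columns 1-12).
sketch_well_names <- function(plate) {
  stopifnot(is.matrix(plate), nrow(plate) == 8, ncol(plate) == 12)
  # Build well labels "A1" ... "H12" in the same shape as the plate
  wells <- outer(LETTERS[1:8], 1:12, paste0)
  data.frame(
    well   = as.vector(wells),
    # Sample name = well + original sample name (separator is an assumption)
    sample = paste(as.vector(wells), as.vector(plate), sep = "_"),
    stringsAsFactors = FALSE
  )
}

plate <- matrix(sprintf("PUTR_068_%02d", 1:96), nrow = 8, ncol = 12)
form <- sketch_well_names(plate)
head(form, 3)
```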
Value
Returns a NULL.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
allele_char Characterizes bins
Description
Records the mean, standard deviation, minimum, maximum, and count for each allele in a locus.
Usage
allele_char(allele.v, allele.df)
Arguments
allele.v All the peaks for a locus.
allele.df an allele file containing bin maxima and minima in columns 8 and 9.
Details
Characterizes bins based on allele data. Alleles in allele.v are sorted from smallest to biggest and then sorted into bins based on the min and max included in allele.df. A mean is recorded and rounded to produce an allele name. Standard deviation, min, max, count and range are also recorded. Complete allele files are printed to file, one for each locus.
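As a rough illustration of the summary described above, the following base R sketch computes the same statistics for each bin. The function name `sketch_characterize` and the column names `min` and `max` are assumptions made for this example; a real allele file keeps the bounds in columns 8 and 9, and nothing is written to file here.

```r
# Hypothetical sketch (not the package's implementation).
# `peaks`: numeric vector of all peaks for one locus.
# `bins`:  data.frame with one row per bin and columns `min` and `max` (assumed names).
sketch_characterize <- function(peaks, bins) {
  peaks <- sort(peaks[!is.na(peaks)])
  rows <- lapply(seq_len(nrow(bins)), function(i) {
    x <- peaks[peaks >= bins$min[i] & peaks <= bins$max[i]]
    if (length(x) == 0) return(NULL)      # skip bins that caught no peaks
    data.frame(allele = round(mean(x)),   # rounded mean serves as the allele name
               mean = mean(x), sd = sd(x),
               min = min(x), max = max(x),
               count = length(x), range = max(x) - min(x))
  })
  do.call(rbind, rows)
}

peaks <- c(150.1, 150.3, 149.9, 154.2, 154.0, 160.7)
bins  <- data.frame(min = c(149.5, 153.5), max = c(150.5, 154.5))
res <- sketch_characterize(peaks, bins)
res
```

The peak at 160.7 falls outside both bins and is simply ignored in this sketch.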
Value
Returns a data.frame.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
allele_hist Creates histograms of alleles
Description
Creates histograms of alleles with bins in alternating colors and non-bins in black.
Usage
allele_hist(gmtc.a, allele.df, xlim=NULL)
Arguments
gmtc.a Allelic data in the form of two columns from a genomatic file.
allele.df An allele file containing bin information for the locus
xlim A vector containing the minimum and maximum values for the x axis
Details
Creates histograms of alleles with bins in alternating colors and non-bins in black.
Value
Prints histogram to graphical device.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
allele_process Processes alleles into bins
Description
Processes alleles into bins for the entire project. Descriptive plots and text output are created.
Usage
allele_process(gmtc, allele.l)
Arguments
gmtc a genomatic data.frame.
allele.l a list where each element contains a genomatic allele file.
Details
Processes alleles into bins for characterization. Alleles are binned by the min and max provided in columns 8 and 9 of each allele file. Histograms are created for each locus with alternating color to indicate bins. Range by count plots are provided to find bins that are unusually wide.
Value
Returns a NULL.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
bin_by_num Graphical output of bin quality
Description
Graphical output of bin quality. Creates plots of bin width as a function of the number of peaks in the bin.
Usage
bin_by_num(locus, loci, ident = 0)
Arguments
locus allele characteristics produced by ‘allele_char’
loci a data.frame of locus names
ident flag to enable interactive identification of points when set to 1. Default is 0 (disabled).
Details
Graphical output of bin quality. Creates plots of bin width as a function of the number of peaks in the bin. A heavy line is plotted at one base pair and thinner lines are plotted at 1.1, 1.2, 1.3, and 1.4 base pairs. It seems intuitive that a bin should not range more than a base pair; however, experience suggests that bins that range more than a single base pair are frequently acceptable. This could be due to error in the size calling (how close the peak is to a size standard), among-capillary error (capillary quality may induce error), or other unforeseen issues.
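A plot of this kind can be sketched with base graphics. The function name `sketch_bin_quality_plot`, the axis labels, and the choice to return the indices of bins wider than one base pair are assumptions for illustration only; they are not taken from the package.

```r
# Hypothetical sketch: bin width (range) against number of peaks per bin.
# `count` and `range` would come from a bin characterization table.
sketch_bin_quality_plot <- function(count, range) {
  plot(count, range,
       xlab = "Number of peaks in bin", ylab = "Bin width (bp)")
  abline(h = 1, lwd = 3)                      # heavy line at one base pair
  abline(h = c(1.1, 1.2, 1.3, 1.4), lwd = 1)  # thinner reference lines
  invisible(which(range > 1))                 # bins wider than one base pair
}

pdf(NULL)  # draw to a null device so the example runs without opening a window
wide <- sketch_bin_quality_plot(count = c(3, 40, 12), range = c(0.4, 1.25, 0.9))
invisible(dev.off())
wide
```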
Value
Returns a NULL.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
bin_caller Bins peaks
Description
Uses allele characteristics to call bins.
Usage
bin_caller(locus, locus_char, ploid=2)
Arguments
locus a data.frame containing all the peaks to be binned
locus_char a table output from ‘allele_char’ containing characteristics of the alleles (mean, sd, etc.)
ploid describes whether allelic data is spread over one column (haploid) or two (diploid).
Details
Uses allele characteristics to call bins. If a particular peak does not fall within any of the bins it is flagged with an ‘NA’.
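The calling rule above (a peak receives the name of the bin it falls in, otherwise NA) might look like this in base R. The function name `sketch_call_bins` and the column names `allele`, `min`, and `max` are assumptions; the sketch handles a single vector of peaks and ignores the `ploid` argument.

```r
# Hypothetical sketch (not the package's implementation).
# `bins`: data.frame with columns `allele`, `min`, `max` (assumed names).
sketch_call_bins <- function(peaks, bins) {
  vapply(peaks, function(p) {
    hit <- which(p >= bins$min & p <= bins$max)
    # Missing peaks and peaks outside every bin are flagged NA
    if (is.na(p) || length(hit) == 0) NA_real_ else bins$allele[hit[1]]
  }, numeric(1))
}

bins  <- data.frame(allele = c(150, 154),
                    min = c(149.5, 153.5), max = c(150.5, 154.5))
calls <- sketch_call_bins(c(150.2, 152.0, 154.1, NA), bins)
calls
```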
Value
Returns a NULL.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
bin_init Initializes allele files
Description
Initializes allele files
Usage
bin_init(samples, gap=0.3)
Arguments
samples a genomatic file where column one contains populations, column 2 contains individuals and subsequent columns contain allelic data, two columns per locus
gap defines the gap size required to start a new bin
Details
Initializes allele files. Writes a comma delimited allele file for each locus.
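One plausible reading of the `gap` argument is that sorted peaks are split into a new bin wherever two neighbours differ by more than `gap`. The sketch below follows that reading for a single locus; the function name `sketch_init_bins`, the strict `>` comparison, and the output columns are assumptions, and no file is written.

```r
# Hypothetical sketch of gap-based bin initialization for one locus.
sketch_init_bins <- function(peaks, gap = 0.3) {
  peaks <- sort(peaks[!is.na(peaks)])
  stopifnot(length(peaks) > 0)
  # A new bin starts at the first peak and wherever the step exceeds `gap`
  bin_id <- cumsum(c(TRUE, diff(peaks) > gap))
  data.frame(
    bin   = unique(bin_id),
    min   = as.vector(tapply(peaks, bin_id, min)),
    max   = as.vector(tapply(peaks, bin_id, max)),
    count = as.vector(tapply(peaks, bin_id, length))
  )
}

init <- sketch_init_bins(c(150.0, 150.1, 150.3, 152.0, 152.2, 156.4))
init
```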
Value
Returns a NULL.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
bin_score Automates bin scoring
Description
Automates the bin calling process
Usage
bin_score(plate, loci, outfile="_scored_gmtc.csv")
Arguments
plate a 96-well plate of samples
loci a table of allele characteristics
outfile a name for an outfile
Details
Automates bin calling, extracts population info from the names, sorts the samples, and writes a text output file.
Value
Returns a NULL.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
cat_sorter Sorts categories to their sample
Description
This function takes the output from Genotyper export tables and sorts it into elements of a list where each element includes only one category (locus).
Usage
cat_sorter(gdata)
Arguments
gdata A data.frame which is input from a Genotyper table export.
Details
The output from the Genotyper table export will include one category (locus) per row and will therefore include one sample on many rows. This function reformats this file so that each sample occupies only a single row and has many categories (loci).
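The long-to-wide reshaping described above can be sketched in base R. The column names `sample`, `category`, `peak1`, and `peak2` are invented for this example and are not the headers of a real Genotyper export; `sketch_long_to_wide` is likewise a hypothetical name.

```r
# Hypothetical sketch: one row per (sample, locus) becomes one row per sample.
sketch_long_to_wide <- function(gdata) {
  samples <- unique(gdata$sample)
  out <- data.frame(sample = samples, stringsAsFactors = FALSE)
  for (locus in unique(gdata$category)) {
    rows <- gdata[gdata$category == locus, ]
    idx  <- match(samples, rows$sample)   # NA where a sample lacks this locus
    out[[paste0(locus, "_a")]] <- rows$peak1[idx]
    out[[paste0(locus, "_b")]] <- rows$peak2[idx]
  }
  out
}

gdata <- data.frame(
  sample   = c("PUTR_068_01", "PUTR_068_01", "PUTR_068_02"),
  category = c("loc1", "loc2", "loc1"),
  peak1    = c(150.1, 201.3, 152.0),
  peak2    = c(154.2, 205.1, 152.0),
  stringsAsFactors = FALSE
)
wide_tab <- sketch_long_to_wide(gdata)
wide_tab
```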
Value
Returns a data.frame of scored data.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
condense Merges samples with multiple non-identical records
Description
Merges samples with multiple non-identical records, only removes NAs.
Usage
condense(gmtc)
Arguments
gmtc A genomatic data.frame.
Details
When a sample has been run more than once but for different loci (multiplexes), these multiple records can be combined without losing data; only NAs are removed.
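A minimal sketch of this merge, assuming the genomatic layout of population in column 1 and individual in column 2. The function name `sketch_condense` is hypothetical, and keeping the first non-missing value when two records disagree is an assumption of this example, not documented package behavior.

```r
# Hypothetical sketch: collapse repeated records of an individual,
# filling each column with its first non-NA value.
sketch_condense <- function(gmtc) {
  first_non_na <- function(x) x[!is.na(x)][1]  # NA if every record is missing
  pieces <- lapply(split(gmtc, gmtc[[2]]), function(d) {
    as.data.frame(lapply(d, first_non_na), stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

gmtc <- data.frame(
  pop    = c("PUTR_068", "PUTR_068", "PUTR_069"),
  ind    = c("PUTR_068_01", "PUTR_068_01", "PUTR_069_01"),
  loc1_a = c(150, NA, 152), loc1_b = c(154, NA, 152),
  loc2_a = c(NA, 201, NA),  loc2_b = c(NA, 205, NA),
  stringsAsFactors = FALSE
)
merged <- sketch_condense(gmtc)
merged
```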
Value
Returns a genomatic data.frame.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
dup_checker Checks for duplicate samples
Description
Checks a file for duplicates. Returns a sorted file containing duplicates.
Usage
dup_checker(samples)
Arguments
samples A data.frame containing your samples.
Details
Checks a file for duplicates. Returns a sorted file containing duplicates. Duplicates are sorted so they appear next to each other. This file can then be manually edited so that there is only one instance of each sample. This file is subsequently used by the function ‘dup_replace’ to remove duplicates and replace them with a single instance.
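The duplicate check can be approximated with `duplicated()` and `order()`. This sketch assumes the sample name sits in the first column; `sketch_find_duplicates` is a hypothetical name, not a package function.

```r
# Hypothetical sketch: return every record whose sample name occurs more
# than once, sorted so that repeats sit next to each other.
sketch_find_duplicates <- function(samples) {
  ids  <- samples[[1]]
  dups <- samples[ids %in% ids[duplicated(ids)], , drop = FALSE]
  dups[order(dups[[1]]), , drop = FALSE]
}

samples <- data.frame(
  sample = c("PUTR_068_02", "PUTR_068_01", "PUTR_069_01", "PUTR_068_01"),
  loc1_a = c(152, 150, 148, NA),
  stringsAsFactors = FALSE
)
dups <- sketch_find_duplicates(samples)
dups
```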
Value
Returns a data.frame of samples that appear more than once.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
dup_replace Replaces duplicate samples with a single sample
Description
Uses the manually edited file created in ‘dup_checker’ to remove duplicate samples in a data.frame.
Usage
dup_replace(all_samps, replace_samps)
Arguments
all_samps A data.frame of your entire dataset (including duplicates).
replace_samps A data.frame of samples created by ‘dup_checker’ and manually edited so only a single instance of each sample appears.
Details
Uses the manually edited file created in ‘dup_checker’ to remove duplicate samples in a data.frame. The sample names in ‘replace_samps’ are used to remove all instances of these names from ‘all_samps’. ‘replace_samps’ is then added to the dataset. Finally ‘order_sample’ is used to order the samples alpha-numerically.
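The three steps above (drop, re-add, sort) can be sketched as follows. The function name `sketch_replace_duplicates` is hypothetical, sample names are assumed to be in the first column, and a plain `order()` stands in for ‘order_sample’.

```r
# Hypothetical sketch of the replace step.
sketch_replace_duplicates <- function(all_samps, replace_samps) {
  # 1. Remove every record whose name appears in the curated set
  keep <- all_samps[!(all_samps[[1]] %in% replace_samps[[1]]), , drop = FALSE]
  # 2. Add the curated single instances back
  out <- rbind(keep, replace_samps)
  # 3. Sort by sample name
  out <- out[order(out[[1]]), , drop = FALSE]
  rownames(out) <- NULL
  out
}

all_samps <- data.frame(
  sample = c("PUTR_068_02", "PUTR_068_01", "PUTR_069_01", "PUTR_068_01"),
  loc1_a = c(152, 150, 148, NA),
  stringsAsFactors = FALSE
)
replace_samps <- data.frame(sample = "PUTR_068_01", loc1_a = 150,
                            stringsAsFactors = FALSE)
clean <- sketch_replace_duplicates(all_samps, replace_samps)
clean
```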
Value
Returns a data.frame that is sorted and includes no duplicates.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
filename2sample Extracts sample names from file names
Description
Takes the field ‘File.Name’ and extracts the sample name from it.
Usage
filename2sample(file.name)
Arguments
file.name A vector of File.Names.
Details
The file name is expected to follow the format specified in the function ‘plate_write’. An example is ‘A10_PUTR_068_01_A10_02.fsa’ where the sample name is ‘PUTR_068_01’. This function extracts this sample name assuming the character ‘_’ delimits fields within ‘File.Name’ and returns a data.frame containing just the sample name.
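Using the example file name above, the extraction can be sketched with `strsplit()`. The rule applied here (drop the first field and the last two) is an assumption inferred from that single example, and `sketch_filename_to_sample` is a hypothetical name.

```r
# Hypothetical sketch: split on "_" and keep the middle fields.
# Assumes each name has at least four "_"-delimited fields.
sketch_filename_to_sample <- function(file.name) {
  fields <- strsplit(file.name, "_", fixed = TRUE)
  samples <- vapply(fields, function(f) {
    n <- length(f)
    paste(f[-c(1, n - 1, n)], collapse = "_")  # drop well prefix and two suffix fields
  }, character(1))
  data.frame(sample = samples, stringsAsFactors = FALSE)
}

snames <- sketch_filename_to_sample(c("A10_PUTR_068_01_A10_02.fsa",
                                      "B03_PUTR_069_12_B03_07.fsa"))
snames
```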
Value
A data.frame containing sample names.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
genobins Remove peaks from files and leave bins
Description
Removes peak calls from files and leaves bin calls. Prepares for output to statistical software.
Usage
genobins(samples)
Arguments
samples a data.frame of samples containing peak calls and bin calls
Details
Removes peak calls from files and leaves bin calls. Prepares for output to statistical software.
Value
Returns a data.frame.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
genogui Genomatic graphical user interface
Description
Graphical user interface designed to run genomatic functions.
Usage
genogui()
Arguments
genogui takes no arguments
Details
Graphical user interface designed to run genomatic functions.
Value
Returns nothing.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
genomatic2genalex Converts data to GenAlEx format
Description
Converts data to GenAlEx format
Usage
genomatic2genalex(data, outfile)
Arguments
data a data.frame containing your data
outfile a name for your outfile
Details
Converts data to GenAlEx format. This software is convenient because it provides output to many other formats. Therefore, even if you’re not interested in GenAlEx you might find it to be a convenient conversion tool. See http://www.anu.edu.au/BoZo/GenAlEx/ for information on GenAlEx.
Value
Returns a NULL.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
genomatic2ntsys Converts data to NTSYSpc format
Description
Converts data to the NTSYSpc format
Usage
genomatic2ntsys(samples, outfile)
Arguments
samples a data.frame of your data
outfile a name for your outfile
Details
Converts data to the NTSYSpc format. See http://www.exetersoftware.com/cat/ntsyspc/ntsyspc.html for information on NTSYSpc. NTSYSpc is commercial software.
Value
Returns a NULL.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
genotyper2genomatic Import genotyper tables
Description
Reformats data from the Genotyper table export format to the genomatic format.
Usage
genotyper2genomatic(gmtc, plate.l)
Arguments
gmtc A genomatic data.frame.
plate.l A list where each element is a plate of samples.
Details
This function reformats data from the Genotyper table export format to the genomatic format. It calls the functions ‘cat_sorter’, ‘filename2sample’, ‘pop_get’, and ‘order_sample’ to accomplish this.
Value
Returns a formatted data.frame.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
init_gmtc Initializes a genomatic data.frame
Description
Initializes a genomatic data.frame.
Usage
init_gmtc(loci)
Arguments
loci A data.frame of loci used in the project
Details
This function initializes a genomatic data.frame. This format is used throughout the genomatic process. A genomatic data.frame is specific to diploid organisms. The first column is a population code. The second column is an individual code. All following columns are diploid loci in pairs. For example, columns three and four would contain locus one alleles a and b.
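The column layout described above can be illustrated with an empty data.frame. The column names `pop` and `ind`, the `_a`/`_b` suffixes, and the use of a character vector of locus names (rather than the data.frame that ‘init_gmtc’ takes) are assumptions of this sketch.

```r
# Hypothetical sketch: an empty table with the genomatic column layout.
sketch_init_genomatic <- function(loci) {
  # Interleave the two allele columns of each locus: loc1_a, loc1_b, loc2_a, ...
  locus_cols <- as.vector(rbind(paste0(loci, "_a"), paste0(loci, "_b")))
  cols <- c("pop", "ind", locus_cols)
  out <- as.data.frame(matrix(NA_real_, nrow = 0, ncol = length(cols)))
  names(out) <- cols
  out
}

empty_gmtc <- sketch_init_genomatic(c("loc1", "loc2"))
names(empty_gmtc)
```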
Value
Returns a genomatic data.frame.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
order_sample Order samples alphanumerically
Description
Organizes the data alpha-numerically.
Usage
order_sample(data_df)
Arguments
data_df A data.frame of data where the first column is the sample name.
Details
A data.frame of data where the first column is the sample name. The data.frame is reordered alpha-numerically based on sample name.
Value
A data.frame.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
plater Organizes individual names into 96-well plates
Description
Organizes individual names into 96-well plates
Usage
plater(samps)
Arguments
samps A vector of individual names
Details
Takes a vector of individual names and organizes them into a matrix in the format of 96-well plates.
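A sketch of this layout in base R follows. The function name `sketch_plater`, the column-wise fill order (A1, B1, ..., H1, A2, ...), and padding the last plate with NA are assumptions; the package may arrange wells differently.

```r
# Hypothetical sketch: split a vector of names into 8 x 12 plate matrices.
sketch_plater <- function(samps) {
  n_plates <- ceiling(length(samps) / 96)
  # Pad so the final plate is complete; empty wells are NA
  padded <- c(samps, rep(NA, n_plates * 96 - length(samps)))
  lapply(seq_len(n_plates), function(i) {
    matrix(padded[((i - 1) * 96 + 1):(i * 96)], nrow = 8, ncol = 12,
           dimnames = list(LETTERS[1:8], as.character(1:12)))
  })
}

plates <- sketch_plater(sprintf("S%03d", 1:100))
length(plates)
plates[[2]][1:5, 1]
```

With 100 names, the second plate holds only four samples, in wells A1 to D1.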
Value
Returns a list where each element is a 96-well plate of samples. Each element is a matrix of samples.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
plate_write Write 96-well maps
Description
A function that takes a list of 96-well plates and writes each to an individual comma delimited file.
Usage
plate_write(plates, prefix)
Arguments
plates A list where each element is a matrix representation of a 96-well plate with samples.
prefix A prefix for your outfile names.
Details
This function takes a list of 96-well plates that has been created by the function ‘plater’ (or is in the same format) and saves each 96-well plate as a comma delimited file. Comma delimited files can easily be imported into spreadsheets for further editing.
Value
Returns a NULL.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
pop2indiv Population Name to Individual Name Conversion
Description
Takes population names and the respective sample size per population (n) and expands the population names to sample names of population_name_1, . . ., population_name_n.
Usage
pop2indiv(sites)
Arguments
sites data.frame containing population names and sample sizes.
Details
The object ‘samps’ should be a vector of factors containing the population names. The object ‘n’ should be a vector of integers containing population sample sizes in the same population order as the object ‘samps’.
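The expansion from population names to individual names can be sketched with `rep()` and `sequence()`. The function name `sketch_pop_to_indiv` is hypothetical, and it takes ‘samps’ and ‘n’ as two separate vectors rather than the ‘sites’ data.frame.

```r
# Hypothetical sketch: expand each population name into n numbered sample names.
sketch_pop_to_indiv <- function(samps, n) {
  paste(rep(as.character(samps), times = n),  # repeat each name n times
        sequence(n),                          # 1..n within each population
        sep = "_")
}

indiv <- sketch_pop_to_indiv(c("PUTR_068", "PUTR_069"), c(3, 2))
indiv
```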
Value
Returns a vector of character strings where each element is a population name and sample number (e.g., population_name_1, . . ., population_name_n).
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
pop_get Get populations from sample names
Description
Extracts the population name from a sample’s name
Usage
pop_get(data)
Arguments
data A data.frame of sample names where the sample names are in the first column. All subsequent columns are ignored.
Details
This function takes a data.frame of sample names which follows the format specified in the function ‘filename2sample’. For example, the sample name ‘PUTR_068_01’ is from the population ‘PUTR_068’. A data.frame consisting of one column which contains the population names is returned.
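Following the ‘PUTR_068_01’ example, the population name can be obtained by dropping the last ‘_’-delimited field. This is a sketch under that assumption; `sketch_pop_get` and the output column name `pop` are hypothetical.

```r
# Hypothetical sketch: strip the final "_field" from each sample name.
sketch_pop_get <- function(data) {
  data.frame(pop = sub("_[^_]*$", "", as.character(data[[1]])),
             stringsAsFactors = FALSE)
}

pops <- sketch_pop_get(data.frame(sample = c("PUTR_068_01", "PUTR_069_12"),
                                  stringsAsFactors = FALSE))
pops
```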
Value
A data.frame consisting of one column which contains the population names.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.
sample_sorter Sorts data from separate lists into a data.frame
Description
Sorts data from separate lists into a data.frame
Usage
sample_sorter(samp_names, locus_l)
Arguments
samp_names a vector of sample names to sort all samples with
locus_l a list where each element contains data for a different locus
Details
Takes a vector of sample names and a list where each element in the list is a data.frame containing a different locus and organizes this into a single data.frame.
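The assembly described above can be sketched with `match()` and `cbind()`. It is assumed here that the first column of each locus data.frame holds the sample name and that samples missing from a locus should be filled with NA; `sketch_sample_sorter` is a hypothetical name.

```r
# Hypothetical sketch: align every locus table to one vector of sample names.
sketch_sample_sorter <- function(samp_names, locus_l) {
  out <- data.frame(sample = samp_names, stringsAsFactors = FALSE)
  for (locus in locus_l) {
    idx <- match(samp_names, locus[[1]])  # NA where the sample is absent
    out <- cbind(out, locus[idx, -1, drop = FALSE])
  }
  rownames(out) <- NULL
  out
}

samp_names <- c("PUTR_068_01", "PUTR_068_02", "PUTR_069_01")
locus_l <- list(
  data.frame(sample = c("PUTR_068_02", "PUTR_068_01"),
             loc1_a = c(152, 150), loc1_b = c(152, 154),
             stringsAsFactors = FALSE),
  data.frame(sample = c("PUTR_069_01", "PUTR_068_01"),
             loc2_a = c(201, 203), loc2_b = c(205, 203),
             stringsAsFactors = FALSE)
)
sorted <- sketch_sample_sorter(samp_names, locus_l)
sorted
```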
Value
Returns a data.frame of sorted data.
Author(s)
Brian J. Knaus <[email protected]> http://brianknaus.com
References
Knaus, B.J. In prep. Genomatic: an R package for DNA fragment analysis project management.