Function to obtain accuracy parameters: correlation coefficient, P-value and RMSE of imputation model

imirage.cv(train_pcg, train_mir, gene_index, num = 50, method = "KNN",
  folds = 10, target = "none", ...)

Arguments

train_pcg

training protein coding dataset. a numeric matrix with row names indicating samples, and column names indicating protein coding gene IDs.

train_mir

training miRNA expression dataset. a numeric matrix with row names indicating samples, and column names indicating miRNA IDs.

gene_index

either gene name (character) or index (column number) of miRNA to be imputed.

num

number of informative protein coding genes to be used in constructing imputation model. Default is 50 genes.

method

method for imputation, either "RF" for random forests, "KNN" for K-nearest neighbor or "SVM" for support vector machines.

folds

number specifying folds (k) of cross validation to obtain imputation accuracy. Default is k=10.

target
 
"none" (default), "ts.pairs", or dataframe/matrix/list. this argument accepts a character string to indicate the use of all candidate genes as predictors ("none"), or the use of built-in TargetScan miRNA-gene pairs ("ts.pairs"). it also accepts a dataframe, matrix or list object containing a column with the names of miRNAs and a column with the names of target genes.

...

optional parameters that can be passed on to the machine-learning functions RF (randomForest), KNN (knn.reg) or SVM (svm).
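The dataframe form of the target argument can be sketched as follows. This is only an illustration: the column names mirna and gene, the placeholder gene names, and the column order are assumptions, since this page does not state what layout the function expects.

```r
# Hypothetical miRNA-target table for the `target` argument.
# Column names and gene names are placeholders, not real annotations.
target_pairs <- data.frame(
  mirna = c("hsa-let-7c", "hsa-let-7c", "hsa-miR-21"),
  gene  = c("GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Candidate predictor genes for a single miRNA
candidates <- target_pairs$gene[target_pairs$mirna == "hsa-let-7c"]
print(candidates)

# The same pairs as a named list (one element per miRNA)
by_mirna <- split(target_pairs$gene, target_pairs$mirna)
print(by_mirna)

# Not run (requires the iMIRAGE package and its example data):
# imirage.cv(GA.pcg, GA.mir, gene_index = "hsa-let-7c", target = target_pairs)
```

Restricting predictors to annotated targets trades a smaller candidate pool for prior biological support; with target = "none" all candidate genes remain available.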

Value

a matrix with three columns corresponding to Spearman's correlation coefficient (labeled PCC in the output), P-value of the fit and root mean squared error (RMSE). each row holds the values for one cross-validation fold.
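To make the three accuracy values concrete, here is a minimal base R sketch of k-fold cross-validation that computes them per fold. It is not the package's implementation: the data are simulated, lm() stands in for the RF/KNN/SVM learners, and the names x, y, fold_id and cv_fold are hypothetical.

```r
# Sketch of k-fold cross-validation for one imputed miRNA (base R only).
# Simulated data; lm() is a stand-in learner, not the package's method.
set.seed(1)
n <- 100                                               # number of samples
x <- matrix(rnorm(n * 5), nrow = n,
            dimnames = list(paste0("S", 1:n), paste0("gene", 1:5)))
y <- as.vector(x %*% c(2, -1, 0.5, 0, 0) + rnorm(n))   # toy miRNA expression

k <- 10
fold_id <- sample(rep(seq_len(k), length.out = n))     # random fold assignment

cv_fold <- function(i) {
  test     <- fold_id == i
  train_df <- data.frame(y = y[!test], x[!test, , drop = FALSE])
  test_df  <- data.frame(x[test, , drop = FALSE])
  fit      <- lm(y ~ ., data = train_df)               # fit on k-1 folds
  pred     <- predict(fit, newdata = test_df)          # impute held-out fold
  ct       <- cor.test(pred, y[test], method = "spearman")
  c(rho     = unname(ct$estimate),                     # Spearman correlation
    p_value = ct$p.value,                              # P-value of the fit
    rmse    = sqrt(mean((pred - y[test])^2)))          # root mean squared error
}

cv_res <- t(sapply(seq_len(k), cv_fold))               # one row per fold
print(cv_res)
```

Each held-out fold is predicted by a model that never saw it, so the per-fold values estimate how well the imputation generalizes to new samples.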

Examples

data(iMIRAGE.datasets)
imirage.cv(GA.pcg, GA.mir, gene_index="hsa-let-7c", method="KNN", num=50)
#>
#> Running 10-folds cross-validation...
#> Iteration 1
#> Iteration 2
#> Iteration 3
#> Iteration 4
#> Iteration 5
#> Iteration 6
#> Iteration 7
#> Iteration 8
#> Iteration 9
#> Iteration 10
#> Cross-validation complete
#>             PCC      P-Value      RMSE
#>  [1,] 0.7104839 1.348813e-05 1860.9367
#>  [2,] 0.7296123 3.452757e-06 1748.7003
#>  [3,] 0.7780488 6.065440e-08 1898.8891
#>  [4,] 0.9039841 1.032795e-10 2035.1897
#>  [5,] 0.7355742 1.534201e-06 1679.2552
#>  [6,] 0.7283422 2.656697e-06 1576.6199
#>  [7,] 0.5344575 1.895061e-03 2438.9442
#>  [8,] 0.5957086 4.093539e-03 1069.9556
#>  [9,] 0.6751337 2.648629e-05 1818.1154
#> [10,] 0.6792510 1.277875e-06  941.1613
imirage.cv(GA.pcg, GA.mir, gene_index=25, method="KNN", num=50)
#>
#> Running 10-folds cross-validation...
#> Iteration 1
#> Iteration 2
#> Iteration 3
#> Iteration 4
#> Iteration 5
#> Iteration 6
#> Iteration 7
#> Iteration 8
#> Iteration 9
#> Iteration 10
#> Cross-validation complete
#>             PCC      P-Value     RMSE
#>  [1,] 0.7157895 9.499217e-07 371.1923
#>  [2,] 0.7879297 5.444091e-07 519.1403
#>  [3,] 0.5757809 1.924680e-04 832.8063
#>  [4,] 0.4890110 1.042596e-02 265.2230
#>  [5,] 0.6843854 1.032461e-06 548.5190
#>  [6,] 0.7036707 2.413066e-05 383.0526
#>  [7,] 0.8089296 0.000000e+00 529.3712
#>  [8,] 0.8061538 2.942123e-06 734.5251
#>  [9,] 0.6521261 7.776225e-05 532.4604
#> [10,] 0.5132693 1.575642e-02 317.4095
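
Since the function returns one row per fold, a single accuracy figure has to be derived from those rows. The sketch below, offered as one possible way to do so, rebuilds the first three rows of the hsa-let-7c output by hand and averages them; in practice the object returned by imirage.cv would be used directly. The names cv_res, fold_means and fold_sds are hypothetical.

```r
# First three folds of the hsa-let-7c example output, re-entered by hand
cv_res <- matrix(
  c(0.7104839, 1.348813e-05, 1860.9367,
    0.7296123, 3.452757e-06, 1748.7003,
    0.7780488, 6.065440e-08, 1898.8891),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("PCC", "P-Value", "RMSE"))
)

fold_means <- colMeans(cv_res)        # average of each metric across folds
fold_sds   <- apply(cv_res, 2, sd)    # fold-to-fold variability

print(round(fold_means[c("PCC", "RMSE")], 4))
print(round(fold_sds[c("PCC", "RMSE")], 4))
```

Averaging the correlation and RMSE columns is straightforward; averaging P-values is less meaningful, so those are better inspected fold by fold.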