Rollup
.gitignore (vendored)
@@ -2,3 +2,4 @@ old/
manual.odt
mwe
centos-upgrade-plan.txt
+workflow/out/
@@ -34,6 +34,7 @@ Insert a general description of Q-HTCP and the Q-HTCP process here.
* [py_gtf_concat](#pygtfconcat)
* [r_compile_gtf](#rcompilegtf)
* [get_studies](#getstudies)
+* [choose_easy_results](#chooseeasyresults)

## Notes

@@ -94,7 +95,7 @@ Insert a general description of Q-HTCP and the Q-HTCP process here.

### parse_input

-`--project`, `--module`, `--nomodule`, and `--submodule` can be passed multiple times or with a comma-separated string
+`--project`, `--module`, `--nomodule`, and `--wrapper` can be passed multiple times or with a comma-separated string

#### Options

@@ -106,9 +107,9 @@ Insert a general description of Q-HTCP and the Q-HTCP process here.

One or more modules to run (default: all), can be passed multiple times or with a comma-separated string

-* **-s\<value\>** | **--submodule=\<value\>**
+* **-w\<value\>** | **--wrapper=\<value\>**

-Requires two arguments: the name of the submodule and its arguments, can be passed multiple times
+Requires two arguments: the name of the wrapper and its arguments, can be passed multiple times

* **-n\<value\>** | **--nomodule=\<value\>**

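For orientation, a hypothetical invocation showing the repeated-flag and comma-separated forms described above (the script name and all project, module, and wrapper values are placeholders, not taken from this commit):

```bash
# Hypothetical example: flags may be repeated or given as comma-separated lists
./qhtcp-workflow --project=ExampleProject \
  --module=easy,gtf \
  --nomodule=remc \
  --wrapper r_interactions "arg1 arg2"   # -w/--wrapper takes the wrapper name plus its arguments
```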
@@ -134,7 +135,7 @@ Insert a general description of Q-HTCP and the Q-HTCP process here.

* **PROJECTS** (array): List of projects to cycle through
* **MODULES** (array): List of modules to run on each project
-* **SUBMODULES** (array): List of submodules and their arguments to run on each project
+* **WRAPPERS** (array): List of wrappers and their arguments to run on each project
* **EXCLUDE_MODULES** (array): List of modules not to run on each project
* **DEBUG** (int): Turn debugging on
* **YES** (int): Turn assume yes on
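A minimal sketch of how these variables might be set in a Bash configuration, assuming the documented array/int types (the project, module, and wrapper names below are invented for illustration):

```bash
# Hypothetical configuration sketch
PROJECTS=("ExampleProject1" "ExampleProject2")   # projects to cycle through
MODULES=("easy" "gtf")                           # modules to run on each project
WRAPPERS=("r_interactions arg1 arg2")            # wrappers plus their arguments
EXCLUDE_MODULES=("remc")                         # modules to skip
DEBUG=1                                          # turn debugging on
YES=0                                            # do not assume yes
```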
@@ -147,10 +148,10 @@ Use a module to:

* Build a new type of analysis from scratch
* Generate project directories
-* Group multiple submodules (and modules) into a larger task
+* Group multiple wrappers (and modules) into a larger task
-* Dictate the ordering of multiple submodules
+* Dictate the ordering of multiple wrappers
-* Competently handle pushd and popd for their submodules if they do not reside in the SCANS/PROJECT_DIR
+* Competently handle pushd and popd for their wrappers if they do not reside in the SCANS/PROJECT_DIR
-* Call their submodules with the appropriate arguments
+* Call their wrappers with the appropriate arguments

### install_dependencies

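As a rough sketch of those module responsibilities (grouping wrappers, fixing their order, and handling pushd/popd), a hypothetical module function could look like the following; the function name, directory variables, and wrapper arguments are assumptions for illustration, not part of this commit:

```bash
# Hypothetical module sketch
module_example() {
  pushd "$SCANS_DIR/$PROJECT_DIR" >/dev/null || return 1   # work inside the project directory
  r_interactions "$easy_results_dir" "$zscores_dir" "$study_info" "$sgd_features" 5 1  # wrappers in order
  py_gtf_concat "$remc_ready_dir"                          # placeholder argument
  popd >/dev/null || return 1
}
```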
@@ -204,6 +205,7 @@ TODO
* MasterPlate_ file **should not be an xlsx file**, no portability
* We can keep the existing xlsx code for old style fallback
* But moving forward should switch to csv or something open
+* Do we need to sync a QHTCP template?

NOTES

@@ -218,6 +220,15 @@ NOTES

Run the EASY matlab program

+INPUT FILES
+* MasterPlate_.xls
+* DrugMedia_.xls
+
+OUTPUT FILES
+* !!ResultsStd_.txt
+* !!ResultsELr_.txt
+

TODO

* Don't create output in the scans folder, put it in an output directory
@@ -612,11 +623,9 @@ TODO
* **$5** (string): All_SGD_GOTerms_for_QHTCPtk.csv
* **$6** (string): zscores_interaction.csv

-## Submodules
+## Wrappers

-Submodules are shell wrappers for workflow components in external languages
+Wrappers:

-Submodules:

* Allow scripts to be called by the main workflow script using input and output arguments as a translation mechanism.
* Only run by default if called by a module.
@@ -665,7 +674,7 @@ Output

*

-This submodule:
+This wrapper:

* Will perform both L and K comparisons for the specified experiment folders.
* The code uses the naming convention of PairwiseCompare_Exp’#’-Exp’#’ to standardize and keep simple the structural naming (where ‘X’ is either K or L and ‘Y’ is the number of the experiment GTA results to be found in ../GTAresult/Exp_).
@@ -697,7 +706,7 @@ Output

*

-This submodule:
+This wrapper:

* The Term Specific Heatmaps are produced directly from the ../ExpStudy/Exp_/ZScores/ZScores_Interaction.csv file generated by the user modified interaction… .R script.
* The heatmap labeling is per the names the user wrote into the StudyInfo.txt spreadsheet.
@@ -716,11 +725,18 @@ This submodule:

### r_interactions

-Run the R interactions analysis (Z_InteractionTemplate.R)
+Run the R interactions analysis (deprecates Z_InteractionTemplate.R)

+SCRIPT: interactions.R

TODO

-* Don't want to rename Z_InteractionTemplate.R because that will break logic, just edit in place instead
+* Parallelization (need to consult with Sean)
+* Needs more loops to reduce verbosity, but don't want to limit flexibility
+* Replace 1:length() with seq_along()
+* Re-enable disabled linter checks
+* Reduce cyclomatic complexity of some of the for loops
+* There needs to be one point of truth for the SD factor

NOTES

@@ -729,10 +745,11 @@ NOTES
#### Arguments

* **$1** (string): The input directory
-* **$2** (string): The study info file
+* **$2** (string): The zscores directory
-* **$3** (string): The zscores directory
+* **$3** (string): The study info file
* **$4** (string): SGD_features.tab
* **$5** (integer): delta SD background value (default: 5)
+* **$6** (integer): experiment number

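For orientation, a hypothetical call of the r_interactions wrapper with the positional arguments in the new order; the paths and experiment number are placeholders:

```bash
# Hypothetical invocation sketch
r_interactions "out/Exp1" "out/Exp1/ZScores" "Code/StudyInfo.csv" \
  "Code/SGD_features.tab" 5 1   # $5 = delta SD background value, $6 = experiment number
```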
### r_join_interactions

@@ -826,7 +843,7 @@ Output

### pl_gtf_analyze

-Perl analyze submodule
+Perl analyze wrapper
This seems weird to me because we're just overwriting the same data for all set2 members
https://metacpan.org/dist/GO-TermFinder/view/examples/analyze.pl
Is there a reason you need a custom version and not the original from cpan?
@@ -840,7 +857,7 @@ Is there a reason you need a custom version and not the original from cpan?

### pl_gtf_terms2tsv

-Perl terms2tsv submodule
+Perl terms2tsv wrapper
Probably should be translated to shell/python

#### Arguments
@@ -849,7 +866,7 @@ Probably should be translated to shell/python

### py_gtf_concat

-Python concat submodule for GTF
+Python concat wrapper for GTF
Concat the process ontology outputs from the /REMcReady_lm_only folder
Probably should be translated to bash

@@ -872,14 +889,14 @@ Parse study names from StudyInfo.csv files

TODO

-* This whole submodule should eventually be either
+* This whole wrapper should eventually be either
* Removed
* Expanded into a file that stores all project/study settings (database)
* I had to add a new line to the end of StudyInfo.csv, may break things?

#### Arguments

-* **$1** (string): File to read
+* **$1** (string): Study info file

#### Variables set

@@ -890,3 +907,21 @@ TODO
* **0**: If one or more studies found
* **1**: If no studies found

+### choose_easy_results
+
+Chooses an EASY scans directory if the information is undefined
+TODO Standardize EASY output, it's hard to understand
+TODO eventually we could run this on multiple results dirs simultaneously with some refactoring
+
+#### Arguments
+
+* **$1** (string): directory containing EASY results dirs
+
+#### Variables set
+
+* **EASY_RESULTS_DIR** (string): The working EASY output directory
+
+#### Exit codes
+
+* **0**: If an EASY results dir was successfully chosen
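A short, hypothetical usage sketch of the choose_easy_results wrapper described above; the results path is a placeholder:

```bash
# Hypothetical usage sketch
if choose_easy_results "workflow/out/easy"; then
  echo "Using EASY results in: $EASY_RESULTS_DIR"
else
  echo "No EASY results dir could be chosen" >&2
fi
```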
@@ -7,7 +7,7 @@ library(plyr)
library(dplyr)
library(sos)

-args=commandArgs(TRUE)
+args <- commandArgs(TRUE)

if (length(args) >= 1) {
finalTable <- args[1]
@@ -33,59 +33,58 @@ if (length(args) >= 4) {
output <- "REMcHeatmaps/REMcWithShift.csv" # for legacy workflow
}


# Read in the REMc finalTable data
-X = data.frame(read.csv(file=finalTable,header=TRUE,stringsAsFactors = FALSE))
+X <- data.frame(read.csv(file = finalTable, header = TRUE, stringsAsFactors = FALSE))
# Read in the shift data From ../JoinInteractions
-Y = data.frame(read.csv(file=shiftFile,header=TRUE,stringsAsFactors = FALSE))
+Y <- data.frame(read.csv(file = shiftFile, header = TRUE, stringsAsFactors = FALSE))
-Labels <- read.delim(studyInfo,skip=0,as.is=T,row.names=1,strip.white=TRUE)
+Labels <- read.delim(studyInfo, skip = 0, as.is = TRUE, row.names = 1, strip.white = TRUE)

# Determine the number of cols - needed to create the correct number of new cols
Xcolnum <- length(X[1, ])
ADDnum <- Xcolnum + length(Y[1, ]) - 2

# Create new columns filled with NAs to be filled with data
-Xtemp= X
+Xtemp <- X
Xtemp[, (Xcolnum + 1):ADDnum] <- NA

# Match the orf names in each row to an orf name in the shift data file and then add the shift data to the finalTable file
shiftTbl <- as.data.frame(matrix(nrow = 1, ncol = length(Y) - 2)) #the df shiftTbl must be initialized before for loop

for (i in 1:length(X[, 1])) {
-Shiftrownum = match(X[i,2],Y[,1])
+Shiftrownum <- match(X[i, 2], Y[, 1])
-shiftTbl[i,]= Y[Shiftrownum,3:length(Y[1,])]
+shiftTbl[i, ] <- Y[Shiftrownum, 3:length(Y[1, ])]
Xtemp[i, (Xcolnum + 1):ADDnum] <- Y[Shiftrownum, 3:length(Y[1, ])]
}
-headerX= colnames(Xtemp)
+headerX <- colnames(Xtemp)
-headerY= colnames(Y)
+headerY <- colnames(Y)
-shfHdr= headerY[3:length(headerY)]
+shfHdr <- headerY[3:length(headerY)]
combTbl <- X[, 1:3]
-lmTbl= select(Xtemp, contains('Z_lm')) #X[,(4:Xcolnum-2)]
+lmTbl <- select(Xtemp, contains("Z_lm")) #X[,(4:Xcolnum-2)]
-shiftTbl<- select(Xtemp, contains('V'))
+shiftTbl <- select(Xtemp, contains("V"))
-clustTbl<- select(Xtemp, contains('cluster.'))
+clustTbl <- select(Xtemp, contains("cluster."))

# Give the new column names the same names as in the shift file
-Xcols = colnames(X)
+Xcols <- colnames(X)
-Ycols = colnames(Y)[3:length(Y[1,])]
+Ycols <- colnames(Y)[3:length(Y[1, ])]
-newCols = c(Xcols[1:Xcolnum],Ycols)
+newCols <- c(Xcols[1:Xcolnum], Ycols)

# Reorder columns for generating heatmaps
-combI= combTbl #Starting Template orf, Genename columns
+combI <- combTbl #Starting Template orf, Genename columns
headersRemc <- newCols #colnames(X)
-newHeaders= newCols[1:3]
+newHeaders <- newCols[1:3]
-lmHdr= colnames(lmTbl) #newCols[4:(length(Xcols)-2)]
+lmHdr <- colnames(lmTbl) #newCols[4:(length(Xcols)-2)]
-clstHdr= colnames(clustTbl) #select(newCols, contains('cluster.')) #newCols[3+length(lmHdr):2]
+clstHdr <- colnames(clustTbl) #select(newCols, contains('cluster.')) #newCols[3+length(lmHdr):2]

-intLvHdr= vector()
+intLvHdr <- vector()
#Reorder columns to produce an interleaved set of Z_lm and Shift data for all the cpps.
for (i in 1:(length(shiftTbl[1, ]))) {
-combI=cbind.data.frame(combI, shiftTbl[i])
+combI <- cbind.data.frame(combI, shiftTbl[i])
-combI=cbind.data.frame(combI, lmTbl[i])
+combI <- cbind.data.frame(combI, lmTbl[i])
-intLvHdrx= c(shfHdr[i],lmHdr[i] )
+intLvHdrx <- c(shfHdr[i], lmHdr[i])
-intLvHdr= c(intLvHdr,intLvHdrx)
+intLvHdr <- c(intLvHdr, intLvHdrx)
}

-combIHdr= c(colnames(combTbl),intLvHdr,clstHdr)
+combIHdr <- c(colnames(combTbl), intLvHdr, clstHdr)
-combI=cbind.data.frame(combI, clustTbl)
+combI <- cbind.data.frame(combI, clustTbl)
-colnames(combI)= combIHdr
+colnames(combI) <- combIHdr
write.csv(combI, file = output, row.names = FALSE)
@@ -9,13 +9,13 @@ args <- commandArgs(TRUE)

# Set output dir
if (length(args) >= 1) {
-input_finalTable <- args[1]
+input_finalTable <- file.path(args[1])
} else {
input_finalTable <- "/REMcHeatmaps/REMcWithShift.csv" # for legacy workflow
}

if (length(args) >= 2) {
-outDir <- args[2]
+outDir <- file.path(args[2])
} else {
outDir <- "/REMcHeatmaps/REMcWithShift.csv" # for legacy workflow
}
@@ -31,11 +31,12 @@ hmapfile[hmapfile == -0.001] <- NA
num_total_genes <- length(hmapfile[, 1])

# Break out the cluster names so each part of the cluster origin can be accessed
-# line below removed because it adds to many genes to clusters when going past 1-0-10 since it cannot differentiate between 1-0-1 and 1-0-10 when using grepl.
+# line below removed because it adds too many genes to clusters when going past 1-0-10
+# since it cannot differentiate between 1-0-1 and 1-0-10 when using grepl.
# hmapfile$cluster.origin = gsub(" ","",x=hmapfile$cluster.origin)

-hmapfile$cluster.origin = gsub(";"," ;",x=hmapfile$cluster.origin)
+hmapfile$cluster.origin <- gsub(";", " ;", x = hmapfile$cluster.origin)
-hmapfile$cluster.origin = strsplit(hmapfile$cluster.origin,';')
+hmapfile$cluster.origin <- strsplit(hmapfile$cluster.origin, ";")
#use tail(x,n) for accessing the outward most cluster

clust_rounds <- 0
@@ -54,7 +55,8 @@ num_unique_clusts <- length(unique_clusts)

# Base the color key on a statistical analysis of the L and K data
# need to create "breaks" to set the color key, need to have 12 different breaks (for 11 colors)
-# scale() will calculate the mean and standard deviation of the entire vector, then "scale" each element by those values by subtracting the mean and dividing by the sd.
+# scale() will calculate the mean and standard deviation of the entire vector,
+# then "scale" each element by those values by subtracting the mean and dividing by the sd.
# hmapfile[,4:(length(hmapfile[1,]) - 2)] <- scale(hmapfile[,4:(length(hmapfile[1,]) - 2)])
# change so that the L data is multiplied to be on the same scale as the K data

@@ -101,12 +103,15 @@ print(KEY_MIN)
print(L_MAX)
# print(L_Multiplier)

-colormapbreaks <- c(KEY_MIN,KEY_MIN*(5/6),KEY_MIN*(4/6),KEY_MIN*(3/6),KEY_MIN*(2/6),KEY_MIN*(1/6),KEY_MAX*(1/6),KEY_MAX*(2/6),KEY_MAX*(3/6),KEY_MAX*(4/6),KEY_MAX*(5/6),KEY_MAX)
+colormapbreaks <- c(KEY_MIN, KEY_MIN * (5 / 6), KEY_MIN * (4 / 6), KEY_MIN * (3 / 6), KEY_MIN * (2 / 6),
+  KEY_MIN * (1 / 6), KEY_MAX * (1 / 6), KEY_MAX * (2 / 6), KEY_MAX * (3 / 6), KEY_MAX * (4 / 6), KEY_MAX * (5 / 6), KEY_MAX)
# print(colormapbreaks)

# Probably should give a way to detect shift in case that it is not in the first row... (maybe just grepl for the whole column name?)
-# However since also using this to amend the first part. Could possibly identify all the ones that contain the word shift and then create an object containing just those numbers
+# However since also using this to amend the first part.
+# Could possibly identify all the ones that contain the word shift and then create an object containing just those numbers
-# then could just use these values and create spaces only between interaction values - possibly could get rid of redundant shift values if we don't want to view these
+# then could just use these values and create spaces only between interaction values
+# possibly could get rid of redundant shift values if we don't want to view these
# could we pool all the shift data/average it?
if (grepl("Shift", colnames(hmapfile)[4], fixed = TRUE) == TRUE) {
even_columns <- seq(from = 2, to = (length(hmapfile[1, ]) - 7), by = 2)
@@ -149,7 +154,6 @@ for(i in 1:length(colnames_edit)){
# if(length(unlist(INT_store)) == 6){
# colnames_edit[i+1] <- paste(unlist(INT_store)[1],unlist(INT_store)[2],unlist(INT_store)[6],sep=" ")
# }

}
}
print(colnames_edit)
@@ -159,7 +163,6 @@ print(colnames_edit)
# colnames_edit[15] <- "TEM HLEG L"
# colnames_edit[20] <- "TEM HL L"


# Create the heatmaps
for (i in 1:num_unique_clusts) {
cluster <- unique_clusts[i]
@@ -168,9 +171,10 @@ for(i in 1:num_unique_clusts){
if (cluster_length != 1) {
X0 <- as.matrix(cluster_data[, 4:(length(hmapfile[1, ]) - 2)])
if (cluster_length >= 2001) {
-mypath = file.path(outDir,paste("cluster_",gsub(" ","",cluster), ".pdf",sep=""))
+mypath <- file.path(outDir, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
pdf(file = mypath, height = 20, width = 15)
-heatmap.2(x=X0,
+heatmap.2(
+  x = X0,
Rowv = TRUE, Colv = NA, distfun = dist, hclustfun = hclust,
dendrogram = "row", cexCol = 0.8, cexRow = 0.1, scale = "none",
breaks = colormapbreaks, symbreaks = FALSE, colsep = even_columns, sepcolor = "white", offsetCol = 0.1,
@@ -181,14 +185,16 @@ for(i in 1:num_unique_clusts){
na.color = "red", col = brewer.pal(11, "PuOr"),
main = cluster,
# ColSideColors=ev_repeat,
-labRow=as.character(cluster_data$Gene), labCol=colnames_edit)
+labRow = as.character(cluster_data$Gene), labCol = colnames_edit
+)
# abline(v=0.5467,col="black")
dev.off()
}
if (cluster_length >= 201 && cluster_length <= 2000) {
-mypath = file.path(outDir,paste("cluster_",gsub(" ","",cluster), ".pdf",sep=""))
+mypath <- file.path(outDir, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
pdf(file = mypath, height = 15, width = 12)
-heatmap.2(x=X0,
+heatmap.2(
+  x = X0,
Rowv = TRUE, Colv = NA, distfun = dist, hclustfun = hclust,
dendrogram = "row", cexCol = 0.8, cexRow = 0.1, scale = "none",
breaks = colormapbreaks, symbreaks = FALSE, colsep = even_columns, sepcolor = "white", offsetCol = 0.1,
@@ -198,14 +204,16 @@ for(i in 1:num_unique_clusts){
keysize = 0.7, trace = "none", density.info = c("none"), margins = c(10, 8),
na.color = "red", col = brewer.pal(11, "PuOr"),
main = cluster,
-labRow=as.character(cluster_data$Gene), labCol=colnames_edit)
+labRow = as.character(cluster_data$Gene), labCol = colnames_edit
+)
# abline(v=0.5316,col="black")
dev.off()
}
if (cluster_length >= 150 && cluster_length <= 200) {
-mypath = file.path(outDir,paste("cluster_",gsub(" ","",cluster), ".pdf",sep=""))
+mypath <- file.path(outDir, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
pdf(file = mypath, height = 12, width = 12)
-heatmap.2(x=X0,
+heatmap.2(
+  x = X0,
Rowv = TRUE, Colv = NA, distfun = dist, hclustfun = hclust,
dendrogram = "row", cexCol = 0.8, cexRow = 0.1, scale = "none",
breaks = colormapbreaks, symbreaks = FALSE, colsep = even_columns, sepcolor = "white", offsetCol = 0.1,
@@ -215,13 +223,15 @@ for(i in 1:num_unique_clusts){
keysize = 1, trace = "none", density.info = c("none"), margins = c(10, 8),
na.color = "red", col = brewer.pal(11, "PuOr"),
main = cluster,
-labRow=as.character(cluster_data$Gene), labCol=colnames_edit)
+labRow = as.character(cluster_data$Gene), labCol = colnames_edit
+)
dev.off()
}
if (cluster_length >= 101 && cluster_length <= 149) {
-mypath = file.path(outDir,paste("cluster_",gsub(" ","",cluster), ".pdf",sep=""))
+mypath <- file.path(outDir, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
pdf(file = mypath, height = 12, width = 12)
-heatmap.2(x=X0,
+heatmap.2(
+  x = X0,
Rowv = TRUE, Colv = NA, distfun = dist, hclustfun = hclust,
dendrogram = "row", cexCol = 0.8, cexRow = 0.2, scale = "none",
breaks = colormapbreaks, symbreaks = FALSE, colsep = even_columns, sepcolor = "white", offsetCol = 0.1,
@@ -231,13 +241,15 @@ for(i in 1:num_unique_clusts){
keysize = 1, trace = "none", density.info = c("none"), margins = c(10, 8),
na.color = "red", col = brewer.pal(11, "PuOr"),
main = cluster,
-labRow=as.character(cluster_data$Gene), labCol=colnames_edit)
+labRow = as.character(cluster_data$Gene), labCol = colnames_edit
+)
dev.off()
}
if (cluster_length >= 60 && cluster_length <= 100) {
-mypath = file.path(outDir,paste("cluster_",gsub(" ","",cluster), ".pdf",sep=""))
+mypath <- file.path(outDir, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
pdf(file = mypath, height = 12, width = 12)
-heatmap.2(x=X0,
+heatmap.2(
+  x = X0,
Rowv = TRUE, Colv = NA, distfun = dist, hclustfun = hclust,
dendrogram = "row", cexCol = 0.8, cexRow = 0.4, scale = "none",
breaks = colormapbreaks, symbreaks = FALSE, colsep = even_columns, sepcolor = "white", offsetCol = 0.1,
@@ -247,13 +259,15 @@ for(i in 1:num_unique_clusts){
keysize = 1, trace = "none", density.info = c("none"), margins = c(10, 8),
na.color = "red", col = brewer.pal(11, "PuOr"),
main = cluster,
-labRow=as.character(cluster_data$Gene), labCol=colnames_edit)
+labRow = as.character(cluster_data$Gene), labCol = colnames_edit
+)
dev.off()
}
if (cluster_length <= 59 && cluster_length >= 30) {
-mypath = file.path(outDir,paste("cluster_",gsub(" ","",cluster), ".pdf",sep=""))
+mypath <- file.path(outDir, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
pdf(file = mypath, height = 9, width = 12)
-heatmap.2(x=X0,
+heatmap.2(
+  x = X0,
Rowv = TRUE, Colv = NA, distfun = dist, hclustfun = hclust,
dendrogram = "row", cexCol = 0.8, cexRow = 0.6, scale = "none",
breaks = colormapbreaks, symbreaks = FALSE, colsep = even_columns, sepcolor = "white", offsetCol = 0.1,
@@ -263,13 +277,15 @@ for(i in 1:num_unique_clusts){
keysize = 1, trace = "none", density.info = c("none"), margins = c(10, 8),
na.color = "red", col = brewer.pal(11, "PuOr"),
main = cluster,
-labRow=as.character(cluster_data$Gene), labCol=colnames_edit)
+labRow = as.character(cluster_data$Gene), labCol = colnames_edit
+)
dev.off()
}
if (cluster_length <= 29) {
-mypath = file.path(outDir,paste("cluster_",gsub(" ","",cluster), ".pdf",sep=""))
+mypath <- file.path(outDir, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
pdf(file = mypath, height = 7, width = 12)
-heatmap.2(x=X0,
+heatmap.2(
+  x = X0,
Rowv = TRUE, Colv = NA,
distfun = dist, hclustfun = hclust,
dendrogram = "row", cexCol = 0.8, cexRow = 0.9, scale = "none",
@@ -280,7 +296,8 @@ for(i in 1:num_unique_clusts){
keysize = 1, trace = "none", density.info = c("none"), margins = c(10, 8),
na.color = "red", col = brewer.pal(11, "PuOr"),
main = cluster,
-labRow=as.character(cluster_data$Gene), labCol=colnames_edit)
+labRow = as.character(cluster_data$Gene), labCol = colnames_edit
+)
dev.off()
}
}
@@ -20,25 +20,25 @@ library(htmlwidgets)

# Parse arguments
args <- commandArgs(TRUE)
-inputFile <- args[1]
+inputFile <- file.path(args[1])

# Set output dir
if (length(args) >= 2) {
-outDir <- args[2]
+outDir <- file.path(args[2])
} else {
outDir <- "/ZScores/" # for legacy workflow
}

# Set StudyInfo file path
if (length(args) >= 3) {
-studyInfo <- args[3]
+studyInfo <- file.path(args[3])
} else {
studyInfo <- "../Code/StudyInfo.csv" # for legacy workflow
}

# Set SGDgeneList file path
if (length(args) >= 4) {
-SGDgeneList <- args[4]
+SGDgeneList <- file.path(args[4])
} else {
SGDgeneList <- "../Code/SGD_features.tab" # for legacy workflow
}
@@ -54,7 +54,7 @@ if (length(args) >= 5) {
}
delBGFactor <- as.numeric(delBGFactor)
if (is.na(delBGFactor)) {
-delBGFactor <- 3 # Recommended by Sean
+delBGFactor <- 3 # recommended by Sean
}
print(paste("The Standard Deviation Value is:", delBGFactor))

@@ -68,21 +68,25 @@ if (length(args) >= 6) {
}
expNumber <- as.numeric(expNumber)

-outDir_QC <- paste(outDir, "QC/", sep = "")
+outDir_QC <- file.path(outDir, "QC")

if (!file.exists(outDir)) {
-dir.create(file.path(outDir))
+dir.create(outDir)
}

if (!file.exists(outDir_QC)) {
dir.create(file.path(outDir_QC))
}

-# Capture Exp_ number, use it to Save args[2]{std}to Labels field and then
-# write to Labels to studyInfo.txt for future reference
-Labels <- read.csv(file = studyInfo, stringsAsFactors = FALSE) # sep = ","
-Labels[expNumber, 3] <- delBGFactor
-write.csv(Labels, file = studyInfo, row.names = FALSE)
+options(width = 1000)
+ls.str()
+# Write delBGFactor to the StudyInfo file
+# TODO we probably shouldn't be doing this, need one source of truth
+# TODO disabling this for now
+# Labels <- read.csv(file = studyInfo, stringsAsFactors = FALSE) # sep = ","
+# Labels[expNumber, 3] <- delBGFactor
+# write.csv(Labels, file = studyInfo, row.names = FALSE)

# Begin User Data Selection Section

@@ -366,19 +370,19 @@ Raw_l_vs_K_beforeQC <-
ggtitle("Raw L vs K before QC") +
theme_publication_legend_right()

-pdf(paste(outDir_QC, "Raw_L_vs_K_beforeQC.pdf", sep = ""), width = 12, height = 8)
+pdf(file.path(outDir_QC, "Raw_L_vs_K_beforeQC.pdf"), width = 12, height = 8)
Raw_l_vs_K_beforeQC

dev.off()

pgg <- ggplotly(Raw_l_vs_K_beforeQC)
-plotly_path <- paste(outDir_QC, "Raw_L_vs_K_beforeQC.html", sep = "")
+plotly_path <- file.path(outDir_QC, "Raw_L_vs_K_beforeQC.html")
saveWidget(pgg, file = plotly_path, selfcontained = TRUE)

# Set delta background tolerance based on 3 sds from the mean delta background
Delta_Background_Tolerance <- mean(X$Delta_Backgrd) + (delBGFactor * sd(X$Delta_Backgrd))
# Delta_Background_Tolerance <- mean(X$Delta_Backgrd)+(3*sd(X$Delta_Backgrd))
-print(paste("Delta_Background_Tolerance is", Delta_Background_Tolerance, sep = " "))
+sprintf("Delta_Background_Tolerance is %f", Delta_Background_Tolerance)

Plate_Analysis_Delta_Backgrd <-
ggplot(X, aes(Scan, Delta_Backgrd, color = as.factor(Conc_Num))) +
@@ -417,13 +421,13 @@ X_Delta_Backgrd_above_Tolerance_L_vs_K <-
) +
theme_publication_legend_right()

-pdf(paste(outDir_QC, "Raw_L_vs_K_for_strains_above_deltabackgrd_threshold.pdf", sep = ""), width = 12, height = 8)
+pdf(file.path(outDir_QC, "Raw_L_vs_K_for_strains_above_deltabackgrd_threshold.pdf"), width = 12, height = 8)
X_Delta_Backgrd_above_Tolerance_L_vs_K

dev.off()

pgg <- ggplotly(X_Delta_Backgrd_above_Tolerance_L_vs_K)
-plotly_path <- paste(outDir_QC, "Raw_L_vs_K_for_strains_above_deltabackgrd_threshold.html", sep = "")
+plotly_path <- file.path(outDir_QC, "Raw_L_vs_K_for_strains_above_deltabackgrd_threshold.html")
saveWidget(pgg, file = plotly_path, selfcontained = TRUE)

# Frequency plot for all data vs. the delta_background
@@ -434,7 +438,7 @@ DeltaBackground_Frequency_Plot <- ggplot(X, aes(Delta_Backgrd, color = as.factor
DeltaBackground_Bar_Plot <- ggplot(X, aes(Delta_Backgrd, color = as.factor(Conc_Num))) + geom_bar() +
ggtitle("Bar plot for Delta Background by Conc All Data") + theme_publication_legend_right()

-pdf(file = paste(outDir_QC, "Frequency_Delta_Background.pdf", sep = ""), width = 12, height = 8)
+pdf(file.path(outDir_QC, "Frequency_Delta_Background.pdf"), width = 12, height = 8)
print(DeltaBackground_Frequency_Plot)
print(DeltaBackground_Bar_Plot)
dev.off()
@@ -568,7 +572,7 @@ Plate_Analysis_Delta_Backgrd_Box_afterQC <-
theme_publication()

# Print the plate analysis data before and after QC
-pdf(file = paste(outDir_QC, "Plate_Analysis.pdf", sep = ""), width = 14, height = 9)
+pdf(file.path(outDir_QC, "Plate_Analysis.pdf"), width = 14, height = 9)
Plate_Analysis_L
Plate_Analysis_L_afterQC
Plate_Analysis_K
@@ -582,7 +586,7 @@ Plate_Analysis_Delta_Backgrd_afterQC
dev.off()

# Print the plate analysis data before and after QC
-pdf(file = paste(outDir_QC, "Plate_Analysis_Boxplots.pdf", sep = ""), width = 18, height = 9)
+pdf(file.path(outDir_QC, "Plate_Analysis_Boxplots.pdf"), width = 18, height = 9)
Plate_Analysis_L_Box
Plate_Analysis_L_Box_afterQC
Plate_Analysis_K_Box
@@ -713,7 +717,7 @@ Plate_Analysis_Delta_Backgrd_Box_afterQC_Z <-
theme_publication()

# Print the plate analysis data before and after QC
-pdf(file = paste(outDir_QC, "Plate_Analysis_noZeros.pdf", sep = ""), width = 14, height = 9)
+pdf(file.path(outDir_QC, "Plate_Analysis_noZeros.pdf"), width = 14, height = 9)
Plate_Analysis_L_afterQC_Z
Plate_Analysis_K_afterQC_Z
Plate_Analysis_r_afterQC_Z
@@ -722,7 +726,7 @@ Plate_Analysis_Delta_Backgrd_afterQC_Z
dev.off()

# Print the plate analysis data before and after QC
-pdf(file = paste(outDir_QC, "Plate_Analysis_noZeros_Boxplots.pdf", sep = ""), width = 18, height = 9)
+pdf(file.path(outDir_QC, "Plate_Analysis_noZeros_Boxplots.pdf"), width = 18, height = 9)
Plate_Analysis_L_Box_afterQC_Z
Plate_Analysis_K_Box_afterQC_Z
Plate_Analysis_r_Box_afterQC_Z
@@ -768,7 +772,7 @@ X_stats_ALL <- ddply(
)

# print(X_stats_ALL_L)
-write.csv(X_stats_ALL, file = paste(outDir, "SummaryStats_ALLSTRAINS.csv"), row.names = FALSE)
+write.csv(X_stats_ALL, file.path(outDir, "SummaryStats_ALLSTRAINS.csv"), row.names = FALSE)

# Part 3 - Generate summary statistics and calculate the max theoretical L value
# Calculate the Z score at each drug conc for each deletion strain
@@ -885,7 +889,7 @@ for (s in Background_Strains) {
se_AUC = sd_AUC / sqrt(N - 1)
)

-write.csv(X_stats_BY, file = paste(outDir, "SummaryStats_BackgroundStrains.csv"), row.names = FALSE)
+write.csv(X_stats_BY, file.path(outDir, "SummaryStats_BackgroundStrains.csv"), row.names = FALSE)

# Calculate the max theoretical L values
# Only look for max values when K is within 2SD of the ref strain
@@ -964,7 +968,7 @@ for (s in Background_Strains) {

write.csv(
X_stats_BY_L_within_2SD_K,
-file = paste(outDir_QC, "Max_Observed_L_Vals_for_spots_within_2SD_K.csv", sep = ""),
+file.path(outDir_QC, "Max_Observed_L_Vals_for_spots_within_2SD_K.csv"),
row.names = FALSE
)

@@ -991,13 +995,13 @@ for (s in Background_Strains) {
ggtitle("Raw L vs K for strains falling outside 2SD of the K mean at each conc") +
theme_publication_legend_right()

-pdf(paste(outDir_QC, "Raw_L_vs_K_for_strains_2SD_outside_mean_K.pdf", sep = ""), width = 10, height = 8)
+pdf(file.path(outDir_QC, "Raw_L_vs_K_for_strains_2SD_outside_mean_K.pdf"), width = 10, height = 8)
print(Outside_2SD_K_L_vs_K)

dev.off()

pgg <- ggplotly(Outside_2SD_K_L_vs_K)
-plotly_path <- paste(outDir_QC, "RawL_vs_K_for_strains_outside_2SD_K.html", sep = "")
+plotly_path <- file.path(outDir_QC, "RawL_vs_K_for_strains_outside_2SD_K.html")
saveWidget(pgg, file = plotly_path, selfcontained = TRUE)

Outside_2SD_K_delta_background_vs_K <-
@@ -1006,13 +1010,13 @@ for (s in Background_Strains) {
ggtitle("DeltaBackground vs K for strains falling outside 2SD of the K mean at each conc") +
theme_publication_legend_right()

-pdf(paste(outDir_QC, "DeltaBackground_vs_K_for_strains_2SD_outside_mean_K.pdf", sep = ""), width = 10, height = 8)
+pdf(file.path(outDir_QC, "DeltaBackground_vs_K_for_strains_2SD_outside_mean_K.pdf"), width = 10, height = 8)
Outside_2SD_K_delta_background_vs_K
dev.off()
pgg <- ggplotly(Outside_2SD_K_delta_background_vs_K)

# pgg
-plotly_path <- paste(outDir_QC, "DeltaBackground_vs_K_for_strains_outside_2SD_K.html", sep = "")
+plotly_path <- file.path(outDir_QC, "DeltaBackground_vs_K_for_strains_outside_2SD_K.html")
saveWidget(pgg, file = plotly_path, selfcontained = TRUE)

# Get the background strain mean values at the no drug conc to calculate shift
@@ -1046,7 +1050,7 @@ for (s in Background_Strains) {
X2_temp <- X2[X2$Conc_Num == Concentration, ]
if (Concentration == 0) {
X2_new <- X2_temp
-print(paste("Check loop order, conc =", Concentration, sep = " "))
+sprintf("Check loop order, conc = %f", Concentration)
}
if (Concentration > 0) {
try(X2_temp[X2_temp$l == 0 & !is.na(X2_temp$l), ]$l <- X_stats_BY_L_within_2SD_K$max[i])
@@ -1055,7 +1059,7 @@ for (s in Background_Strains) {
# X2_temp[X2_temp$K == 0, ]$K <- X_stats_ALL_K$max[i]
# X2_temp[X2_temp$r == 0, ]$r <- X_stats_ALL_r$max[i]
# X2_temp[X2_temp$AUC == 0, ]$AUC <- X_stats_ALL_AUC$max[i]
-print(paste("Check loop order, conc =", Concentration, sep = " "))
+sprintf("Check loop order, conc = %f", Concentration)
X2_new <- rbind(X2_new, X2_temp)
}
}
@@ -1075,13 +1079,13 @@ for (s in Background_Strains) {
X2_RF_temp <- X2_RF[X2_RF$Conc_Num == Concentration, ]
if (Concentration == 0) {
X2_RF_new <- X2_RF_temp
-print(paste("Check loop order, conc =", Concentration, sep = " "))
+sprintf("Check loop order, conc = %f", Concentration)
}
if (Concentration > 0) {
try(X2_RF_temp[X2_RF_temp$l == 0 & !is.na(X2_RF_temp$l), ]$l <- X_stats_BY_L_within_2SD_K$max[i])
try(X2_temp[X2_temp$l >= X_stats_BY_L_within_2SD_K$max[i] & !is.na(X2_temp$l), ]$SM <- 1)
try(X2_RF_temp[X2_RF_temp$l >= X_stats_BY_L_within_2SD_K$max[i] & !is.na(X2_RF_temp$l), ]$l <- X_stats_BY_L_within_2SD_K$max[i])
-print(paste("If error, refs have no L values outside theoretical max L, for REFs, conc =", Concentration, sep = " "))
+sprintf("If error, refs have no L values outside theoretical max L, for REFs, conc = %f", Concentration)
X2_RF_new <- rbind(X2_RF_new, X2_RF_temp)
}
}
@@ -1446,7 +1450,7 @@ for (s in Background_Strains) {
InteractionScores_RF$Z_lm_AUC <- (InteractionScores_RF$lm_Score_AUC - lm_mean_AUC) / (lm_sd_AUC)
InteractionScores_RF <- InteractionScores_RF[order(InteractionScores_RF$Z_lm_L, decreasing = TRUE), ]
InteractionScores_RF <- InteractionScores_RF[order(InteractionScores_RF$NG, decreasing = TRUE), ]
-write.csv(InteractionScores_RF, paste(outDir, "RF_ZScores_Interaction.csv", sep = ""), row.names = FALSE)
+write.csv(InteractionScores_RF, file.path(outDir, "RF_ZScores_Interaction.csv"), row.names = FALSE)

for (i in 1:num_genes_RF) {
Gene_Sel <- unique(InteractionScores_RF$OrfRep)[i]
@@ -1548,7 +1552,7 @@ for (s in Background_Strains) {
}
}
print("Pass RF ggplot loop")
-write.csv(X_stats_interaction_ALL_RF_final, paste(outDir, "RF_ZScore_Calculations.csv", sep = ""), row.names = FALSE)
+write.csv(X_stats_interaction_ALL_RF_final, file.path(outDir, "RF_ZScore_Calculations.csv"), row.names = FALSE)

# Part 5 - Get Zscores for Gene deletion strains

@@ -1904,7 +1908,7 @@ for (s in Background_Strains) {
InteractionScores <- InteractionScores[order(InteractionScores$NG, decreasing = TRUE), ]
df_order_by_OrfRep <- unique(InteractionScores$OrfRep)
# X_stats_interaction_ALL <- X_stats_interaction_ALL[order(X_stats_interaction_ALL$NG, decreasing = TRUE), ]
-write.csv(InteractionScores, paste(outDir, "ZScores_Interaction.csv", sep = ""), row.names = FALSE)
+write.csv(InteractionScores, file.path(outDir, "ZScores_Interaction.csv"), row.names = FALSE)

InteractionScores_deletion_enhancers_L <-
InteractionScores[InteractionScores$Avg_Zscore_L >= 2, ]
@@ -1957,25 +1961,25 @@ for (s in Background_Strains) {
 InteractionScores_deletion_enhancers_Avg_Zscore_Suppressors_lm_K[
 !is.na(InteractionScores_deletion_enhancers_Avg_Zscore_Suppressors_lm_K$OrfRep), ]
 write.csv(InteractionScores_deletion_enhancers_L,
-paste(outDir, "ZScores_Interaction_DeletionEnhancers_L.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionEnhancers_L.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_enhancers_K,
-paste(outDir, "ZScores_Interaction_DeletionEnhancers_K.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionEnhancers_K.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_suppressors_L,
-paste(outDir, "ZScores_Interaction_DeletionSuppressors_L.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionSuppressors_L.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_suppressors_K,
-paste(outDir, "ZScores_Interaction_DeletionSuppressors_K.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionSuppressors_K.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_enhancers_and_Suppressors_L,
-paste(outDir, "ZScores_Interaction_DeletionEnhancers_and_Suppressors_L.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionEnhancers_and_Suppressors_L.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_enhancers_and_Suppressors_K,
-paste(outDir, "ZScores_Interaction_DeletionEnhancers_and_Suppressors_K.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionEnhancers_and_Suppressors_K.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_enhancers_lm_Suppressors_AvgZscore_L,
-paste(outDir, "ZScores_Interaction_Suppressors_and_lm_Enhancers_L.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_Suppressors_and_lm_Enhancers_L.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_enhancers_Avg_Zscore_Suppressors_lm_L,
-paste(outDir, "ZScores_Interaction_Enhancers_and_lm_Suppressors_L.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_Enhancers_and_lm_Suppressors_L.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_enhancers_lm_Suppressors_AvgZscore_K,
-paste(outDir, "ZScores_Interaction_Suppressors_and_lm_Enhancers_K.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_Suppressors_and_lm_Enhancers_K.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_enhancers_Avg_Zscore_Suppressors_lm_K,
-paste(outDir, "ZScores_Interaction_Enhancers_and_lm_Suppressors_K.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_Enhancers_and_lm_Suppressors_K.csv"), row.names = FALSE)

 # Get enhancers and suppressors for linear regression
 InteractionScores_deletion_enhancers_L_lm <-
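The change repeated throughout these hunks swaps `paste(outDir, ..., sep = "")` for `file.path(outDir, ...)`. A minimal sketch of the difference, using a hypothetical `outDir` value rather than anything from the scripts:

```r
# paste(..., sep = "") only concatenates strings, so it relies on outDir
# already ending in "/"; file.path() inserts the path separator itself.
outDir <- "workflow/out"                 # hypothetical value
paste(outDir, "scores.csv", sep = "")    # "workflow/outscores.csv" -- separator missing
file.path(outDir, "scores.csv")          # "workflow/out/scores.csv"
```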
@@ -2010,19 +2014,19 @@ for (s in Background_Strains) {
 !is.na(InteractionScores_deletion_enhancers_and_Suppressors_K_lm$OrfRep), ]

 write.csv(InteractionScores_deletion_enhancers_L_lm,
-paste(outDir, "ZScores_Interaction_DeletionEnhancers_L_lm.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionEnhancers_L_lm.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_enhancers_K_lm,
-paste(outDir, "ZScores_Interaction_DeletionEnhancers_K_lm.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionEnhancers_K_lm.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_suppressors_L_lm,
-paste(outDir, "ZScores_Interaction_DeletionSuppressors_L_lm.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionSuppressors_L_lm.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_suppressors_K_lm,
-paste(outDir, "ZScores_Interaction_DeletionSuppressors_K_lm.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionSuppressors_K_lm.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_enhancers_and_Suppressors_L_lm,
-paste(outDir, "ZScores_Interaction_DeletionEnhancers_and_Suppressors_L_lm.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionEnhancers_and_Suppressors_L_lm.csv"), row.names = FALSE)
 write.csv(InteractionScores_deletion_enhancers_and_Suppressors_K_lm,
-paste(outDir, "ZScores_Interaction_DeletionEnhancers_and_Suppressors_K_lm.csv", sep = ""), row.names = FALSE)
+file.path(outDir, "ZScores_Interaction_DeletionEnhancers_and_Suppressors_K_lm.csv"), row.names = FALSE)
-write.csv(Labels, file = paste(studyInfo), row.names = FALSE)
+# write.csv(Labels, studyInfo, row.names = FALSE)
-# write.table(Labels, file = paste("../Code/StudyInfo.txt"), sep = "\t", row.names = FALSE)
+# write.table(Labels, file.path("../Code/StudyInfo.txt"), sep = "\t", row.names = FALSE)

 for (i in 1:num_genes) {
 Gene_Sel <- unique(InteractionScores$OrfRep)[i]
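The write.csv() calls above differ only in the object written and the file name, so a table-driven loop is one way the repetition could be collapsed. This is not part of the commit, only a sketch; the object and file names come from the hunks above, the helper structure does not:

```r
# Sketch: map each result data frame to its output file, then write in one loop.
outputs <- list(
  "ZScores_Interaction_DeletionEnhancers_L_lm.csv"   = InteractionScores_deletion_enhancers_L_lm,
  "ZScores_Interaction_DeletionSuppressors_L_lm.csv" = InteractionScores_deletion_suppressors_L_lm
  # ... remaining data frames ...
)
for (name in names(outputs)) {
  write.csv(outputs[[name]], file.path(outDir, name), row.names = FALSE)
}
```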
@@ -2113,11 +2117,11 @@ for (s in Background_Strains) {
 }
 }
 print("Pass Int ggplot loop")
-write.csv(X_stats_interaction_ALL_final, paste(outDir, "ZScore_Calculations.csv", sep = ""), row.names = FALSE)
+write.csv(X_stats_interaction_ALL_final, file.path(outDir, "ZScore_Calculations.csv"), row.names = FALSE)

 Blank <- ggplot(X2_RF) + geom_blank()

-pdf(paste(outDir, "InteractionPlots.pdf", sep = ""), width = 16, height = 16, onefile = TRUE)
+pdf(file.path(outDir, "InteractionPlots.pdf"), width = 16, height = 16, onefile = TRUE)

 X_stats_X2_RF <- ddply(
 X2_RF,
@@ -2344,7 +2348,7 @@ for (s in Background_Strains) {

 dev.off()

-pdf(paste(outDir, "RF_InteractionPlots.pdf", sep = ""), width = 16, height = 16, onefile = TRUE)
+pdf(file.path(outDir, "RF_InteractionPlots.pdf"), width = 16, height = 16, onefile = TRUE)

 X_stats_X2_RF <- ddply(
 X2_RF,
@@ -2732,7 +2736,7 @@ for (s in Background_Strains) {
 geom_hline(yintercept = c(-3, 3)) + geom_point(size = 0.1, shape = 3) +
 theme_publication()

-pdf(paste(outDir, "RankPlots.pdf", sep = ""), width = 18, height = 12, onefile = TRUE)
+pdf(file.path(outDir, "RankPlots.pdf"), width = 18, height = 12, onefile = TRUE)

 grid.arrange(
 Rank_L_1SD,
@@ -2866,7 +2870,7 @@ for (s in Background_Strains) {
 geom_hline(yintercept = c(-3, 3)) + geom_point(size = 0.1, shape = 3) +
 theme_publication()

-pdf(paste(outDir, "RankPlots_lm.pdf", sep = ""), width = 18, height = 12, onefile = TRUE)
+pdf(file.path(outDir, "RankPlots_lm.pdf"), width = 18, height = 12, onefile = TRUE)

 grid.arrange(
 Rank_L_1SD_lm,
@@ -2916,7 +2920,7 @@ for (s in Background_Strains) {
 get_lm_AUC <- lm(X_NArm$Z_lm_AUC ~ X_NArm$Avg_Zscore_AUC)
 AUC_lm <- summary(get_lm_AUC)

-pdf(paste(outDir, "Avg_Zscore_vs_lm_NA_rm.pdf", sep = ""), width = 16, height = 12, onefile = TRUE)
+pdf(file.path(outDir, "Avg_Zscore_vs_lm_NA_rm.pdf"), width = 16, height = 12, onefile = TRUE)

 print(ggplot(X_NArm, aes(Avg_Zscore_L, Z_lm_L)) +
 geom_point(aes(color = Overlap), shape = 3) +
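A side note on the `lm()` call kept as context here: the `X_NArm$...` form inside the formula works, but the equivalent `data =` form is the more common idiom and gives cleaner coefficient names. This is a sketch, not a change made by this commit:

```r
# Equivalent fit; coefficients are named "Avg_Zscore_AUC" rather than
# "X_NArm$Avg_Zscore_AUC", and predict() on new data behaves as expected.
get_lm_AUC <- lm(Z_lm_AUC ~ Avg_Zscore_AUC, data = X_NArm)
AUC_lm <- summary(get_lm_AUC)
AUC_lm$r.squared  # the value the plot annotations round and display
```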
@@ -2955,7 +2959,7 @@ for (s in Background_Strains) {
 annotate("text", x = 0, y = 0, label = paste("R-squared = ", round(L_lm$r.squared, 2))) + theme_publication_legend_right()

 pgg <- ggplotly(lm_v_Zscore_L)
-plotly_path <- paste(outDir, "Avg_Zscore_vs_lm_NA_rm.html", sep = "")
+plotly_path <- file.path(outDir, "Avg_Zscore_vs_lm_NA_rm.html")
 saveWidget(pgg, file = plotly_path, selfcontained = TRUE)

 X_NArm$L_Rank <- rank(X_NArm$Avg_Zscore_L)
@@ -2978,7 +2982,7 @@ for (s in Background_Strains) {
 AUC_lm2 <- summary(get_lm_AUC2)
 num_genes_NArm2 <- (dim(X_NArm)[1]) / 2

-pdf(paste(outDir, "Avg_Zscore_vs_lm_ranked_NA_rm.pdf", sep = ""), width = 16, height = 12, onefile = TRUE)
+pdf(file.path(outDir, "Avg_Zscore_vs_lm_ranked_NA_rm.pdf"), width = 16, height = 12, onefile = TRUE)

 print(
 ggplot(X_NArm, aes(L_Rank, L_Rank_lm)) +
@@ -3114,7 +3118,7 @@ for (s in Background_Strains) {
 geom_hline(yintercept = c(-3, 3)) + geom_point(size = 0.1, shape = 3) +
 theme_publication()

-pdf(paste(outDir, "RankPlots_naRM.pdf", sep = ""), width = 18, height = 12, onefile = TRUE)
+pdf(file.path(outDir, "RankPlots_naRM.pdf"), width = 18, height = 12, onefile = TRUE)

 grid.arrange(
 Rank_L_1SD,
@@ -3234,7 +3238,7 @@ for (s in Background_Strains) {
 geom_hline(yintercept = c(-3, 3)) + geom_point(size = 0.1, shape = 3) +
 theme_publication()

-pdf(paste(outDir, "RankPlots_lm_naRM.pdf", sep = ""), width = 18, height = 12, onefile = TRUE)
+pdf(file.path(outDir, "RankPlots_lm_naRM.pdf"), width = 18, height = 12, onefile = TRUE)

 grid.arrange(
 Rank_L_1SD_lm,
@@ -3273,7 +3277,7 @@ L_lm_5 <- summary(get_lm_5)
 get_lm_6 <- lm(X_NArm$Z_lm_AUC ~ X_NArm$Z_lm_r)
 L_lm_6 <- summary(get_lm_6)

-pdf(file = paste(outDir, "Correlation_CPPs.pdf", sep = ""), width = 10, height = 7, onefile = TRUE)
+pdf(file.path(outDir, "Correlation_CPPs.pdf"), width = 10, height = 7, onefile = TRUE)

 ggplot(X_NArm, aes(Z_lm_L, Z_lm_K)) +
 geom_point(shape = 3, color = "gray70") +
@@ -3482,5 +3486,5 @@ ggplot(X_NArm, aes(Z_lm_r, Z_lm_AUC)) +

 dev.off()

-# write.csv(Labels, file = paste("../Code/Parameters.csv"), row.names = FALSE)
+# write.csv(Labels, file.path("../Code/Parameters.csv"), row.names = FALSE)
 timestamp()
@@ -14,7 +14,7 @@
 # Insert a general description of Q-HTCP and the Q-HTCP process here.

 shopt -s extglob # Turn on extended globbing
-DEBUG=1 # Turn debugging ON by default during development
+DEBUG=${DEBUG:-1} # Turn debugging ON by default during development

 # @description Use `--help` to print the help message.
 # @internal
@@ -25,13 +25,13 @@ print_help() {

 cat <<-EOF
 USAGE:
-script-run-workflow [[OPTION] [VALUE]]...
+qhtcp-workflow [[OPTION] [VALUE]]...

 Some options (--project, --include, --exclude) can be passed multiple times or
 by using comma-separated strings (see EXAMPLES below)

 OPTIONS:
---project, -p PROJECT
+--project, -p PROJECT[,PROJECT...]
 PROJECT should follow the pattern ${PROJECT_PREFIX}_PROJECT_NAME
 --module, -m MODULE[,MODULE...]
 See MODULES section below for list of available modules
@@ -65,13 +65,13 @@ print_help() {
 BiocManager: ${depends_bioc[@]}

 EXAMPLES:
-script-run-workflow --module ${ALL_MODULES[0]} --module ${ALL_MODULES[1]}
+qhtcp-workflow --module ${ALL_MODULES[0]} --module ${ALL_MODULES[1]}
-script-run-workflow --project ${PROJECT_PREFIX}_MY_PROJECT --module ${ALL_MODULES[0]} --module ${ALL_MODULES[1]}
+qhtcp-workflow --project ${PROJECT_PREFIX}_MY_PROJECT --module ${ALL_MODULES[0]} --module ${ALL_MODULES[1]}
-script-run-workflow --project ${PROJECT_PREFIX}_MY_PROJECT --module ${ALL_MODULES[1]} --module ${ALL_MODULES[2]} --yes
+qhtcp-workflow --project ${PROJECT_PREFIX}_MY_PROJECT --module ${ALL_MODULES[1]} --module ${ALL_MODULES[2]} --yes
-script-run-workflow --module=${ALL_MODULES[0]},${ALL_MODULES[1]}
+qhtcp-workflow --module=${ALL_MODULES[0]},${ALL_MODULES[1]}
-script-run-workflow --project=${PROJECT_PREFIX}_MY_PROJECT,${PROJECT_PREFIX}_MY_OTHER_PROJECT
+qhtcp-workflow --project=${PROJECT_PREFIX}_MY_PROJECT,${PROJECT_PREFIX}_MY_OTHER_PROJECT
-script-run-workflow --project=${PROJECT_PREFIX}_MY_PROJECT,${PROJECT_PREFIX}_MY_OTHER_PROJECT --module=${ALL_MODULES[1]},${ALL_MODULES[2]} --yes --debug
+qhtcp-workflow --project=${PROJECT_PREFIX}_MY_PROJECT,${PROJECT_PREFIX}_MY_OTHER_PROJECT --module=${ALL_MODULES[1]},${ALL_MODULES[2]} --yes --debug
-script-run-workflow --project=${PROJECT_PREFIX}_MY_PROJECT --wrapper ${ALL_WRAPPERS[2]} \"/path/to/genefile.txt,/path/to/output/dir\" --wrapper ${ALL_WRAPPERS[3]} \"/path/to/sgofile\"
+qhtcp-workflow --project=${PROJECT_PREFIX}_MY_PROJECT --wrapper ${ALL_WRAPPERS[2]} \"/path/to/genefile.txt,/path/to/output/dir\" --wrapper ${ALL_WRAPPERS[3]} \"/path/to/sgofile\"
 EOF
 }

@@ -506,6 +506,7 @@ interactive_header() {
 printf "%d. %s\n" $((i+1)) "${ALL_WRAPPERS[i]}"
 done
 fi
+echo ""

 # Let user choose or add project(s)
 if [[ ${#PROJECTS[@]} -eq 0 ]]; then
@@ -630,7 +631,7 @@ module install_dependencies
 #
 # #### Perl
 #
-# * `cpan File::Map ExtUtils::PkgConfig GD GO::TermFinder`
+# * `cpan -I -i File::Map ExtUtils::PkgConfig GD GO::TermFinder`
 #
 # #### R
 #
@@ -643,10 +644,14 @@ install_dependencies() {
 debug "Running: ${FUNCNAME[0]} $*"

 # Dependency arrays
-depends_rpm=(graphviz pandoc pdftk-java gd-devel perl-CPAN shdoc nano rsync coreutils libcurl-devel openssl-devel harfbuzz-devel fribidi-devel R-core R-core-devel)
-depends_deb=(graphviz pandoc pdftk-java libgd-dev perl shdoc nano rsync coreutils libcurl-dev openssl-dev libharfbuzz-dev libfribidi-dev r-base r-base-dev)
+depends_rpm=(graphviz pandoc pdftk-java gd-devel perl-CPAN shdoc nano
+rsync coreutils libcurl-devel openssl-devel harfbuzz-devel fribidi-devel
+R-core R-core-devel)
+depends_deb=(graphviz pandoc pdftk-java libgd-dev perl shdoc nano rsync
+coreutils libcurl-dev openssl-dev libharfbuzz-dev libfribidi-dev r-base r-base-dev)
 depends_brew=(graphiz pandoc gd pdftk-java shdoc nano perl rsync coreutils harfbuzz fribidi r)
-depends_perl=(Test::Warnings Test::Fatal File::Map Sub::Uplevel ExtUtils::Config ExtUtils::PkgConfig IPC::Run Module::Build::Tiny IPC::Run GD GO::TermFinder)
+depends_perl=(Test::Warnings Test::Fatal File::Map Sub::Uplevel ExtUtils::Config
+ExtUtils::PkgConfig IPC::Run Module::Build::Tiny GD GO::TermFinder)
 depends_r=(BiocManager ontologyIndex ggrepel tidyverse sos openxlsx ggplot2
 plyr extrafont gridExtra gplots stringr plotly ggthemes pandoc rmarkdown
 plotly htmlwidgets gplots gdata)
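`depends_r` lists the R packages the workflow expects. The part of `install_dependencies` that consumes the array is outside this hunk, so the following is only a sketch of the usual pattern for installing whatever is missing; the package vector is abbreviated from the array above and the repository URL is an assumption:

```r
# Sketch: install only the R packages from depends_r that are not yet present.
depends_r <- c("BiocManager", "ontologyIndex", "ggrepel", "tidyverse", "sos",
               "openxlsx", "ggplot2", "plyr", "gridExtra", "stringr", "plotly")  # abbreviated
missing <- setdiff(depends_r, rownames(installed.packages()))
if (length(missing) > 0) {
  install.packages(missing, repos = "https://cloud.r-project.org")
}
```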
@@ -1791,6 +1796,8 @@ wrapper r_interactions
 # * Replace 1:length() with seq_along()
 # * Re-enable disabled linter checks
 # * Reduce cyclomatic complexity of some of the for loops
+# * There needs to be one point of truth for the SD factor
+# * Replace most paste() functions with printf()
 #
 # NOTES
 #
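Two of the TODO items in this hunk are small R idiom changes; a sketch of both with hypothetical values (nothing below is taken from the scripts):

```r
# 1:length(x) misbehaves when x is empty; seq_along(x) does not.
x <- character(0)
1:length(x)    # 1 0  -- a for loop over this still runs twice
seq_along(x)   # integer(0) -- the loop body is skipped, as intended

# R's printf-style formatter is sprintf(); it can replace chained paste() calls.
gene <- "YLR056W"; z <- 3.214                          # hypothetical values
paste("Gene ", gene, " Z = ", round(z, 2), sep = "")   # current style
sprintf("Gene %s Z = %.2f", gene, z)                   # TODO style
```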
@@ -1930,6 +1937,12 @@ r_add_shift_values() {

 wrapper r_create_heat_maps
 # @description Execute createHeatMaps.R
+#
+# TODO
+# * Needs more looping for brevity
+#
+#
+#
 # @arg $1 string The final shift table (REMcWithShift.csv)
 # @arg $2 string The output directory
 r_create_heat_maps() {
@@ -2151,7 +2164,7 @@ generate_markdown() {
 }


-# @description The main loop of script-run-workflow
+# @description The main loop of qhtcp-workflow
 # May eventually need to add git ops
 # Passes on arguments
 # Most variables in main() are user configurable or can be overriden by env