Rollup before parallelization
@@ -1,5 +1,5 @@
|
|||||||
linters: linters_with_defaults(
|
linters: linters_with_defaults(
|
||||||
object_name_linter = NULL,
|
# object_name_linter = NULL,
|
||||||
object_usage_linter = NULL,
|
object_usage_linter = NULL,
|
||||||
commented_code_linter = NULL,
|
commented_code_linter = NULL,
|
||||||
trailing_whitespace_linter(allow_empty_lines = TRUE),
|
trailing_whitespace_linter(allow_empty_lines = TRUE),
|
||||||
|
|||||||
@@ -33,7 +33,7 @@ Insert a general description of Q-HTCP and the Q-HTCP process here.
|
|||||||
* [pl_gtf_terms2tsv](#plgtfterms2tsv)
|
* [pl_gtf_terms2tsv](#plgtfterms2tsv)
|
||||||
* [py_gtf_concat](#pygtfconcat)
|
* [py_gtf_concat](#pygtfconcat)
|
||||||
* [r_compile_gtf](#rcompilegtf)
|
* [r_compile_gtf](#rcompilegtf)
|
||||||
* [get_studies](#getstudies)
|
* [study_info](#studyinfo)
|
||||||
* [choose_easy_results](#chooseeasyresults)
|
* [choose_easy_results](#chooseeasyresults)
|
||||||
|
|
||||||
## Notes
|
## Notes
|
||||||
@@ -183,7 +183,7 @@ If you wish to install them manually, you can use the following information to d
|
|||||||
|
|
||||||
#### Perl
|
#### Perl
|
||||||
|
|
||||||
* `cpan File::Map ExtUtils::PkgConfig GD GO::TermFinder`
|
* `cpan -I -i File::Map ExtUtils::PkgConfig GD GO::TermFinder`
|
||||||
|
|
||||||
#### R
|
#### R
|
||||||
|
|
||||||
@@ -199,7 +199,7 @@ This module:
|
|||||||
|
|
||||||
* Initializes a project directory in the scans directory
|
* Initializes a project directory in the scans directory
|
||||||
|
|
||||||
TODO
|
:bulb: **TODO**
|
||||||
|
|
||||||
* Copy over source image directories from robot
|
* Copy over source image directories from robot
|
||||||
* MasterPlate_ file **should not be an xlsx file**, no portability
|
* MasterPlate_ file **should not be an xlsx file**, no portability
|
||||||
@@ -207,7 +207,7 @@ TODO
|
|||||||
* But moving forward should switch to csv or something open
|
* But moving forward should switch to csv or something open
|
||||||
* Do we need to sync a QHTCP template?
|
* Do we need to sync a QHTCP template?
|
||||||
|
|
||||||
NOTES
|
:memo: **NOTES**
|
||||||
|
|
||||||
* Copy over the images from the robot and then DO NOT TOUCH that directory except to copy from it
|
* Copy over the images from the robot and then DO NOT TOUCH that directory except to copy from it
|
||||||
* Write-protect (read-only) if we need to
|
* Write-protect (read-only) if we need to
|
||||||
@@ -522,12 +522,11 @@ TODO WIP
|
|||||||
System for Multi-QHTCP-Experiment Gene Interaction Profiling Analysis
|
System for Multi-QHTCP-Experiment Gene Interaction Profiling Analysis
|
||||||
|
|
||||||
* Functional rewrite of REMcMaster3.sh, RemcMaster2.sh, REMcJar2.sh, ExpFrontend.m, mProcess.sh, mFunction.sh, mComponent.sh
|
* Functional rewrite of REMcMaster3.sh, RemcMaster2.sh, REMcJar2.sh, ExpFrontend.m, mProcess.sh, mFunction.sh, mComponent.sh
|
||||||
* Added a newline character to the end of StudyInfo.csv so it is a valid text file
|
* Added a newline character to the end of the study info file so it is a valid text file
|
||||||
|
|
||||||
TODO
|
TODO
|
||||||
|
|
||||||
* Suggest renaming StudiesQHTCP to something like qhtcp qhtcp_output or output
|
* Suggest renaming StudiesQHTCP to something like qhtcp qhtcp_output or output
|
||||||
* Store StudyInfo somewhere better
|
|
||||||
* Move (hide) the study template somewhere else
|
* Move (hide) the study template somewhere else
|
||||||
* StudiesArchive should be smarter:
|
* StudiesArchive should be smarter:
|
||||||
* Create a database with as much information as possible
|
* Create a database with as much information as possible
|
||||||
@@ -592,7 +591,7 @@ TODO
|
|||||||
|
|
||||||
#### Arguments
|
#### Arguments
|
||||||
|
|
||||||
* **$1** (string): studyInfo file
|
* **$1** (string): study info file
|
||||||
|
|
||||||
### gtf
|
### gtf
|
||||||
|
|
||||||
@@ -640,14 +639,14 @@ TODO
|
|||||||
* Is GTAtemplate.R actually a template?
|
* Is GTAtemplate.R actually a template?
|
||||||
* Do we need to allow user customization?
|
* Do we need to allow user customization?
|
||||||
|
|
||||||
Files
|
INPUT
|
||||||
|
|
||||||
* [gene_association.sgd](https://downloads.yeastgenome.org/curation/chromosomal_feature/gene_association.sgd)
|
* [gene_association.sgd](https://downloads.yeastgenome.org/curation/chromosomal_feature/gene_association.sgd)
|
||||||
* go_terms.tab
|
* go_terms.tab
|
||||||
|
|
||||||
Output
|
OUTPUT
|
||||||
|
|
||||||
*
|
* Average_GOTerms_All.csv
|
||||||
|
|
||||||
#### Arguments
|
#### Arguments
|
||||||
|
|
||||||
@@ -663,11 +662,13 @@ PairwiseLK.R R script
|
|||||||
|
|
||||||
TODO
|
TODO
|
||||||
|
|
||||||
* Should move directory creation from PairwiseLK.R to gta module
|
* Move directory creation from PairwiseLK.R to gta module
|
||||||
|
* Needs better output filenames and directory organization
|
||||||
|
* Needs more for looping to reduce verbosity
|
||||||
|
|
||||||
Files
|
INPUT
|
||||||
|
|
||||||
*
|
* Average_GOTerms_All.csv
|
||||||
*
|
*
|
||||||
|
|
||||||
Output
|
Output
|
||||||
@@ -684,7 +685,7 @@ This wrapper:
|
|||||||
|
|
||||||
* **$1** (string): First Exp# name
|
* **$1** (string): First Exp# name
|
||||||
* **$2** (string): Second Exp# name
|
* **$2** (string): Second Exp# name
|
||||||
* **$3** (string): StudyInfo.csv file
|
* **$3** (string): study info file
|
||||||
* **$4** (string): output directory
|
* **$4** (string): output directory
|
||||||
|
|
||||||
### r_gta_heatmaps
|
### r_gta_heatmaps
|
||||||
@@ -693,9 +694,10 @@ TSHeatmaps5dev2.R R script
|
|||||||
|
|
||||||
TODO
|
TODO
|
||||||
|
|
||||||
* Script could use rename
|
* Rename
|
||||||
* Script should be refactored to automatically allow more studies
|
* Refactor to automatically allow more studies
|
||||||
* Script should be refactored with more looping to reduce verbosity
|
* Refactor with more looping to reduce verbosity
|
||||||
|
* Reduce cyclomatic complexity of some of the for loops
|
||||||
|
|
||||||
Files
|
Files
|
||||||
|
|
||||||
@@ -709,13 +711,13 @@ Output
|
|||||||
This wrapper:
|
This wrapper:
|
||||||
|
|
||||||
* The Term Specific Heatmaps are produced directly from the ../ExpStudy/Exp_/ZScores/ZScores_Interaction.csv file generated by the user modified interaction… .R script.
|
* The Term Specific Heatmaps are produced directly from the ../ExpStudy/Exp_/ZScores/ZScores_Interaction.csv file generated by the user modified interaction… .R script.
|
||||||
* The heatmap labeling is per the names the user wrote into the StudyInfo.txt spreadsheet.
|
* The heatmap labeling is per the names the user wrote into the study info file
|
||||||
* Verify that the All_SGD_GOTerms_for_QHTCPtk.csv found in ../Code is what you wish to use or if you wish to use a custom modified version.
|
* Verify that the All_SGD_GOTerms_for_QHTCPtk.csv found in ../Code is what you wish to use or if you wish to use a custom modified version.
|
||||||
* If you wish to use a custom modified version, create it and modify the TSHeatmaps template script (TSHeatmaps5dev2.R) and save it as a ‘TSH_study specific name’.
|
* If you wish to use a custom modified version, create it and modify the TSHeatmaps template script (TSHeatmaps5dev2.R) and save it as a ‘TSH_study specific name’.
|
||||||
|
|
||||||
#### Arguments
|
#### Arguments
|
||||||
|
|
||||||
* **$1** (string): StudyInfo.csv file
|
* **$1** (string): study info file
|
||||||
* **$2** (string): gene_ontology_edit.obo file
|
* **$2** (string): gene_ontology_edit.obo file
|
||||||
* **$3** (string): go_terms.tab file
|
* **$3** (string): go_terms.tab file
|
||||||
* **$4** (string): All_SGD_GOTerms_for_QHTCPtk.csv
|
* **$4** (string): All_SGD_GOTerms_for_QHTCPtk.csv
|
||||||
@@ -737,6 +739,14 @@ TODO
|
|||||||
* Re-enable disabled linter checks
|
* Re-enable disabled linter checks
|
||||||
* Reduce cyclomatic complexity of some of the for loops
|
* Reduce cyclomatic complexity of some of the for loops
|
||||||
* There needs to be one point of truth for the SD factor
|
* There needs to be one point of truth for the SD factor
|
||||||
|
* Replace most paste() functions with printf()
|
||||||
|
|
||||||
|
INPUT
|
||||||
|
|
||||||
|
* easy/results_std.txt
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
NOTES
|
NOTES
|
||||||
|
|
||||||
@@ -744,18 +754,26 @@ NOTES
|
|||||||
|
|
||||||
#### Arguments
|
#### Arguments
|
||||||
|
|
||||||
* **$1** (string): The input directory
|
* **$1** (string): The input results_std.txt
|
||||||
* **$2** (string): The zscores directory
|
* **$2** (string): The zscores directory
|
||||||
* **$3** (string): The study info file
|
* **$3** (string): The study info file
|
||||||
* **$4** (string): SGD_features.tab
|
* **$4** (string): SGD_features.tab
|
||||||
* **$5** (integer): delta SD background value (default: 5)
|
* **$5** (integer): experiment number
|
||||||
* **$6** (integer): experiment number
|
* **$6** (integer): delta SD background value (default: 3)
|
||||||
|
|
||||||
### r_join_interactions
|
### r_join_interactions
|
||||||
|
|
||||||
JoinInteractExps3dev.R creates REMcRdy_lm_only.csv and Shift_only.csv
|
JoinInteractExps3dev.R creates REMcRdy_lm_only.csv and Shift_only.csv
|
||||||
|
|
||||||
Output
|
TODO
|
||||||
|
|
||||||
|
* Needs more loops to reduce verbosity
|
||||||
|
|
||||||
|
INPUT
|
||||||
|
|
||||||
|
*
|
||||||
|
|
||||||
|
OUTPUT
|
||||||
|
|
||||||
* REMcRdy_lm_only.csv
|
* REMcRdy_lm_only.csv
|
||||||
* Shift_only.csv
|
* Shift_only.csv
|
||||||
@@ -765,7 +783,7 @@ Output
|
|||||||
|
|
||||||
* **$1** (string): The output directory
|
* **$1** (string): The output directory
|
||||||
* **$2** (string): The sd value
|
* **$2** (string): The sd value
|
||||||
* **$3** (string): The studyInfo file
|
* **$3** (string): The study info file
|
||||||
|
|
||||||
### java_extract
|
### java_extract
|
||||||
|
|
||||||
@@ -785,10 +803,10 @@ NOTE
|
|||||||
|
|
||||||
#### Arguments
|
#### Arguments
|
||||||
|
|
||||||
* **$1** (string): GeneByGOAttributeMatrix_nofiltering-2009Dec07.tab
|
* **$1** (string): The output directory
|
||||||
* **$2** (string): ORF_List_Without_DAmPs.txt
|
* **$2** (string): ORF_List_Without_DAmPs.txt
|
||||||
* **$3** (string): REMcRdy_lm_only.csv
|
* **$3** (string): REMcRdy_lm_only.csv
|
||||||
* **$4** (string): The output directory
|
* **$4** (string): GeneByGOAttributeMatrix_nofiltering-2009Dec07.tab
|
||||||
* **$5** (string): The output file
|
* **$5** (string): The output file
|
||||||
|
|
||||||
#### Exit codes
|
#### Exit codes
|
||||||
@@ -805,13 +823,25 @@ and output "REMcWithShift.csv" for use with the REMc heat maps
|
|||||||
|
|
||||||
* **$1** (string): REMcRdy_lm_only.csv-finalTable.csv
|
* **$1** (string): REMcRdy_lm_only.csv-finalTable.csv
|
||||||
* **$2** (string): Shift_only.csv
|
* **$2** (string): Shift_only.csv
|
||||||
* **$3** (string): StudyInfo.csv file
|
* **$3** (string): study info file
|
||||||
* **$4** (string): The sd value
|
* **$4** (string): sd value
|
||||||
|
|
||||||
### r_create_heat_maps
|
### r_create_heat_maps
|
||||||
|
|
||||||
Execute createHeatMaps.R
|
Execute createHeatMaps.R
|
||||||
|
|
||||||
|
INPUT
|
||||||
|
|
||||||
|
* REMcWithShift.csv
|
||||||
|
|
||||||
|
OUTPUT
|
||||||
|
|
||||||
|
* compiledREMcHeatmaps.pdf
|
||||||
|
|
||||||
|
TODO
|
||||||
|
|
||||||
|
* Needs more looping for brevity
|
||||||
|
|
||||||
#### Arguments
|
#### Arguments
|
||||||
|
|
||||||
* **$1** (string): The final shift table (REMcWithShift.csv)
|
* **$1** (string): The final shift table (REMcWithShift.csv)
|
||||||
@@ -832,7 +862,9 @@ Execute createHeatMapsAll.R
|
|||||||
|
|
||||||
Perform python dcon portion of GTF
|
Perform python dcon portion of GTF
|
||||||
|
|
||||||
Output
|
SCRIPT: [DconJG2.py](apps/python/DconJG2.py)
|
||||||
|
|
||||||
|
OUTPUT
|
||||||
|
|
||||||
* 1-0-0-finaltable.csv
|
* 1-0-0-finaltable.csv
|
||||||
|
|
||||||
@@ -844,9 +876,13 @@ Output
|
|||||||
### pl_gtf_analyze
|
### pl_gtf_analyze
|
||||||
|
|
||||||
Perl analyze wrapper
|
Perl analyze wrapper
|
||||||
This seems weird to me because we're just overwriting the same data for all set2 members
|
|
||||||
https://metacpan.org/dist/GO-TermFinder/view/examples/analyze.pl
|
SCRIPT: [analyze_v2.pl](https://metacpan.org/dist/GO-TermFinder/view/examples/analyze.pl)
|
||||||
Is there a reason you need a custom version and not the original from cpan?
|
|
||||||
|
TODO
|
||||||
|
|
||||||
|
* Are we just overwriting the same data for all set2 members?
|
||||||
|
* Why the custom version?
|
||||||
|
|
||||||
#### Arguments
|
#### Arguments
|
||||||
|
|
||||||
@@ -858,7 +894,10 @@ Is there a reason you need a custom version and not the original from cpan?
|
|||||||
### pl_gtf_terms2tsv
|
### pl_gtf_terms2tsv
|
||||||
|
|
||||||
Perl terms2tsv wrapper
|
Perl terms2tsv wrapper
|
||||||
Probably should be translated to shell/python
|
|
||||||
|
TODO
|
||||||
|
|
||||||
|
* Probably should be translated to shell/python
|
||||||
|
|
||||||
#### Arguments
|
#### Arguments
|
||||||
|
|
||||||
@@ -868,7 +907,10 @@ Probably should be translated to shell/python
|
|||||||
|
|
||||||
Python concat wrapper for GTF
|
Python concat wrapper for GTF
|
||||||
Concat the process ontology outputs from the /REMcReady_lm_only folder
|
Concat the process ontology outputs from the /REMcReady_lm_only folder
|
||||||
Probably should be translated to bash
|
|
||||||
|
TODO
|
||||||
|
|
||||||
|
* Probably should be translated to bash
|
||||||
|
|
||||||
#### Arguments
|
#### Arguments
|
||||||
|
|
||||||
@@ -883,24 +925,18 @@ Compile GTF in R
|
|||||||
|
|
||||||
* **$1** (string): gtf output directory
|
* **$1** (string): gtf output directory
|
||||||
|
|
||||||
### get_studies
|
### study_info
|
||||||
|
|
||||||
Parse study names from StudyInfo.csv files
|
Creates, modifies, and parses the study info file
|
||||||
|
|
||||||
TODO
|
TODO
|
||||||
|
|
||||||
* This whole wrapper should eventually be either
|
* Needs refactoring
|
||||||
* Removed
|
* Ended up combining a few functions into one
|
||||||
* Expanded into a file that stores all project/study settings (database)
|
|
||||||
* I had to had a new line to the end of StudyInfo.csv, may break things?
|
|
||||||
|
|
||||||
#### Arguments
|
|
||||||
|
|
||||||
* **$1** (string): Study info file
|
|
||||||
|
|
||||||
#### Variables set
|
#### Variables set
|
||||||
|
|
||||||
* **STUDIES_NUMS** (array): Contains Exp numbers
|
* **STUDIES_NUMS** (array): contains Exp numbers
|
||||||
|
|
||||||
#### Exit codes
|
#### Exit codes
|
||||||
|
|
||||||
|
|||||||
@@ -8,17 +8,16 @@
|
|||||||
# @arg $2 string gene_ontology_edit.obo file
|
# @arg $2 string gene_ontology_edit.obo file
|
||||||
# @arg $3 string go_terms.tab file
|
# @arg $3 string go_terms.tab file
|
||||||
# @arg $4 string All_SGD_GOTerms_for_QHTCPtk.csv
|
# @arg $4 string All_SGD_GOTerms_for_QHTCPtk.csv
|
||||||
# @arg $5 string ZScores_interaction.csv
|
# @arg $5 string base directory
|
||||||
# @arg $6 string base directory
|
# @arg $6 string output directory
|
||||||
# @arg $7 string output directory
|
|
||||||
|
|
||||||
library("ontologyIndex")
|
library("ontologyIndex")
|
||||||
library("ggplot2")
|
library("ggplot2")
|
||||||
library("RColorBrewer")
|
library("RColorBrewer")
|
||||||
library("grid")
|
library("grid")
|
||||||
library("ggthemes")
|
library("ggthemes")
|
||||||
#library("plotly")
|
# library("plotly")
|
||||||
#library("htmlwidgets")
|
# library("htmlwidgets")
|
||||||
library("extrafont")
|
library("extrafont")
|
||||||
library("stringr")
|
library("stringr")
|
||||||
library("org.Sc.sgd.db")
|
library("org.Sc.sgd.db")
|
||||||
@@ -31,10 +30,9 @@ study_info_file <- args[1]
|
|||||||
ontology_file <- args[2]
|
ontology_file <- args[2]
|
||||||
sgd_terms_tfile <- args[3]
|
sgd_terms_tfile <- args[3]
|
||||||
all_sgd_terms_csv <- args[4]
|
all_sgd_terms_csv <- args[4]
|
||||||
zscores_file <- args[5]
|
base_dir <- args[5]
|
||||||
base_dir <- args[6]
|
output_dir <- args[6]
|
||||||
output_dir <- args[7]
|
study_nums <- args[7:length(args)]
|
||||||
study_nums <- args[8:length(args)]
|
|
||||||
|
|
||||||
# Import standard tables used in Sean's code That should be copied to each ExpStudy
|
# Import standard tables used in Sean's code That should be copied to each ExpStudy
|
||||||
labels <- read.csv(file = study_info_file, stringsAsFactors = FALSE)
|
labels <- read.csv(file = study_info_file, stringsAsFactors = FALSE)
|
||||||
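Note on the trailing study numbers: `args[7:length(args)]` counts backwards when only six arguments are supplied (`7:6` is `c(7, 6)`), so a missing study number yields `NA` plus the sixth argument rather than an error. A minimal sketch of a stricter parser follows. `parse_args` and `n_fixed` are hypothetical names that do not exist in the script, and converting the study numbers to integers is an assumption.

```r
# Hypothetical helper: split positional arguments into the fixed leading
# arguments and the variable-length tail of study numbers.
parse_args <- function(args, n_fixed = 6) {
  if (length(args) <= n_fixed) {
    stop("expected ", n_fixed, " fixed arguments followed by at least one study number")
  }
  list(
    fixed = args[seq_len(n_fixed)],
    # Negative indexing drops the fixed arguments and keeps the rest.
    study_nums = as.integer(args[-seq_len(n_fixed)])
  )
}

# Example invocation with placeholder file names (assumed, for illustration).
parsed <- parse_args(c("info.csv", "go.obo", "terms.tab", "all.csv", "base", "out", "1", "3"))
```

In the script itself the input would be `commandArgs(TRUE)` instead of the literal vector.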
@@ -52,7 +50,7 @@ XX3[, 2] <- gsub(pattern = "/", replacement = "_", x = XX3[, 2])
|
|||||||
|
|
||||||
# Load input files
|
# Load input files
|
||||||
for (study_num in study_nums) {
|
for (study_num in study_nums) {
|
||||||
input_file <- file.path(base_dir, paste("Exp", study_num), zscores_file)
|
input_file <- file.path(base_dir, paste("Exp", study_num), zscores, "zscores_interaction.csv")
|
||||||
if (file.exists(input_file)) {
|
if (file.exists(input_file)) {
|
||||||
assign(paste(X, study_num), read.csv(file = input_file, stringsAsFactors = FALSE, header = TRUE))
|
assign(paste(X, study_num), read.csv(file = input_file, stringsAsFactors = FALSE, header = TRUE))
|
||||||
assign(paste(Name, study_num), labels[study_num, 2])
|
assign(paste(Name, study_num), labels[study_num, 2])
|
||||||
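Note on the loading loop: `paste("Exp", study_num)` inserts a space by default (`"Exp 1"`), and `assign()` creates one global variable per study. The sketch below shows one possible alternative that collects the tables in a named list. `load_zscores` is a hypothetical helper, and the `Exp<N>/zscores/zscores_interaction.csv` layout is an assumption inferred from the hunk above, not a confirmed directory structure.

```r
# Hypothetical helper: read one zscores table per study into a named list.
# The directory layout (base_dir/Exp<N>/zscores/<file_name>) is assumed.
load_zscores <- function(base_dir, study_nums, file_name = "zscores_interaction.csv") {
  tables <- list()
  for (study_num in study_nums) {
    # paste0() avoids the space that paste("Exp", 1) would insert.
    input_file <- file.path(base_dir, paste0("Exp", study_num), "zscores", file_name)
    if (file.exists(input_file)) {
      tables[[paste0("X", study_num)]] <- read.csv(input_file, stringsAsFactors = FALSE)
    }
  }
  tables
}

# Self-contained demonstration using a temporary directory.
base <- tempfile()
dir.create(file.path(base, "Exp1", "zscores"), recursive = TRUE)
write.csv(data.frame(a = 1:2),
          file.path(base, "Exp1", "zscores", "zscores_interaction.csv"),
          row.names = FALSE)
res <- load_zscores(base, c(1, 2))
```

Studies whose file is missing are skipped without an error, which mirrors the `file.exists()` guard in the original loop.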
@@ -441,7 +439,14 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
if (Parent_Size > 2000) {
|
if (Parent_Size > 2000) {
|
||||||
pdf(file = paste(output_dir, XX3[s, 2], ".pdf", sep = ""), width = 12, height = 45, onefile = TRUE)
|
|
||||||
|
pdf(
|
||||||
|
file = file.path(output_dir, paste(XX3[s, 2], ".pdf", sep = "")),
|
||||||
|
width = 12,
|
||||||
|
height = 45,
|
||||||
|
onefile = TRUE
|
||||||
|
)
|
||||||
|
|
||||||
for (i in 1:length(GOTerm_parent)) {
|
for (i in 1:length(GOTerm_parent)) {
|
||||||
GO_Term <- GOTerm_parent[i]
|
GO_Term <- GOTerm_parent[i]
|
||||||
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
||||||
@@ -461,7 +466,7 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
||||||
na.color = "red", col = brewer.pal(11, "PuOr"),
|
na.color = "red", col = brewer.pal(11, "PuOr"),
|
||||||
main = GO_Term_Name,
|
main = GO_Term_Name,
|
||||||
#ColSideColors = ev_repeat,
|
# ColSideColors = ev_repeat,
|
||||||
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
||||||
))
|
))
|
||||||
}
|
}
|
||||||
@@ -470,7 +475,14 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
if (Parent_Size >= 1000 && Parent_Size <= 2000) {
|
if (Parent_Size >= 1000 && Parent_Size <= 2000) {
|
||||||
pdf(file = paste(output_dir, XX3[s, 2], ".pdf", sep = ""), width = 12, height = 35, onefile = TRUE)
|
|
||||||
|
pdf(
|
||||||
|
file = file.path(output_dir, paste(XX3[s, 2], ".pdf", sep = "")),
|
||||||
|
width = 12,
|
||||||
|
height = 35,
|
||||||
|
onefile = TRUE
|
||||||
|
)
|
||||||
|
|
||||||
for (i in 1:length(GOTerm_parent)) {
|
for (i in 1:length(GOTerm_parent)) {
|
||||||
GO_Term <- GOTerm_parent[i]
|
GO_Term <- GOTerm_parent[i]
|
||||||
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
||||||
@@ -490,7 +502,7 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
||||||
na.color = "red", col = brewer.pal(11, "PuOr"),
|
na.color = "red", col = brewer.pal(11, "PuOr"),
|
||||||
main = GO_Term_Name,
|
main = GO_Term_Name,
|
||||||
#ColSideColors = ev_repeat,
|
# ColSideColors = ev_repeat,
|
||||||
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
||||||
))
|
))
|
||||||
}
|
}
|
||||||
@@ -499,7 +511,14 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
if (Parent_Size >= 500 && Parent_Size <= 1000) {
|
if (Parent_Size >= 500 && Parent_Size <= 1000) {
|
||||||
pdf(file = paste(output_dir, XX3[s, 2], ".pdf", sep = ""), width = 12, height = 30, onefile = TRUE)
|
|
||||||
|
pdf(
|
||||||
|
file = file.path(output_dir, paste(XX3[s, 2], ".pdf", sep = "")),
|
||||||
|
width = 12,
|
||||||
|
height = 30,
|
||||||
|
onefile = TRUE
|
||||||
|
)
|
||||||
|
|
||||||
for (i in 1:length(GOTerm_parent)) {
|
for (i in 1:length(GOTerm_parent)) {
|
||||||
GO_Term <- GOTerm_parent[i]
|
GO_Term <- GOTerm_parent[i]
|
||||||
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
||||||
@@ -519,7 +538,7 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
||||||
na.color = "red", col = brewer.pal(11, "PuOr"),
|
na.color = "red", col = brewer.pal(11, "PuOr"),
|
||||||
main = GO_Term_Name,
|
main = GO_Term_Name,
|
||||||
#ColSideColors = ev_repeat,
|
# ColSideColors = ev_repeat,
|
||||||
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
||||||
))
|
))
|
||||||
}
|
}
|
||||||
@@ -528,7 +547,14 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
if (Parent_Size >= 200 && Parent_Size <= 500) {
|
if (Parent_Size >= 200 && Parent_Size <= 500) {
|
||||||
pdf(file = paste(output_dir, XX3[s, 2], ".pdf", sep = ""), width = 12, height = 25, onefile = TRUE)
|
|
||||||
|
pdf(
|
||||||
|
file = file.path(output_dir, paste(XX3[s, 2], ".pdf", sep = "")),
|
||||||
|
width = 12,
|
||||||
|
height = 25,
|
||||||
|
onefile = TRUE
|
||||||
|
)
|
||||||
|
|
||||||
for (i in 1:length(GOTerm_parent)) {
|
for (i in 1:length(GOTerm_parent)) {
|
||||||
GO_Term <- GOTerm_parent[i]
|
GO_Term <- GOTerm_parent[i]
|
||||||
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
||||||
@@ -548,7 +574,7 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
||||||
na.color = "red", col = brewer.pal(11, "PuOr"),
|
na.color = "red", col = brewer.pal(11, "PuOr"),
|
||||||
main = GO_Term_Name,
|
main = GO_Term_Name,
|
||||||
#ColSideColors = ev_repeat,
|
# ColSideColors = ev_repeat,
|
||||||
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
||||||
))
|
))
|
||||||
}
|
}
|
||||||
@@ -557,7 +583,14 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
if (Parent_Size >= 100 && Parent_Size <= 200) {
|
if (Parent_Size >= 100 && Parent_Size <= 200) {
|
||||||
pdf(file = paste(output_dir, XX3[s, 2], ".pdf", sep = ""), width = 12, height = 20, onefile = TRUE)
|
|
||||||
|
pdf(
|
||||||
|
file = file.path(output_dir, paste(XX3[s, 2], ".pdf", sep = "")),
|
||||||
|
width = 12,
|
||||||
|
height = 20,
|
||||||
|
onefile = TRUE
|
||||||
|
)
|
||||||
|
|
||||||
for (i in 1:length(GOTerm_parent)) {
|
for (i in 1:length(GOTerm_parent)) {
|
||||||
GO_Term <- GOTerm_parent[i]
|
GO_Term <- GOTerm_parent[i]
|
||||||
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
||||||
@@ -577,7 +610,7 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
||||||
na.color = "red", col = brewer.pal(11, "PuOr"),
|
na.color = "red", col = brewer.pal(11, "PuOr"),
|
||||||
main = GO_Term_Name,
|
main = GO_Term_Name,
|
||||||
#ColSideColors = ev_repeat,
|
# ColSideColors = ev_repeat,
|
||||||
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
||||||
))
|
))
|
||||||
}
|
}
|
||||||
@@ -586,7 +619,14 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
if (Parent_Size >= 60 && Parent_Size <= 100) {
|
if (Parent_Size >= 60 && Parent_Size <= 100) {
|
||||||
pdf(file = paste(output_dir, XX3[s, 2], ".pdf", sep = ""), width = 12, height = 15, onefile = TRUE)
|
|
||||||
|
pdf(
|
||||||
|
file = file.path(output_dir, paste(XX3[s, 2], ".pdf", sep = "")),
|
||||||
|
width = 12,
|
||||||
|
height = 15,
|
||||||
|
onefile = TRUE
|
||||||
|
)
|
||||||
|
|
||||||
for (i in 1:length(GOTerm_parent)) {
|
for (i in 1:length(GOTerm_parent)) {
|
||||||
GO_Term <- GOTerm_parent[i]
|
GO_Term <- GOTerm_parent[i]
|
||||||
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
||||||
@@ -606,7 +646,7 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
||||||
na.color = "red", col = brewer.pal(11, "PuOr"),
|
na.color = "red", col = brewer.pal(11, "PuOr"),
|
||||||
main = GO_Term_Name,
|
main = GO_Term_Name,
|
||||||
#ColSideColors = ev_repeat,
|
# ColSideColors = ev_repeat,
|
||||||
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
||||||
))
|
))
|
||||||
}
|
}
|
||||||
@@ -615,7 +655,14 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
if (Parent_Size >= 30 && Parent_Size <= 60) {
|
if (Parent_Size >= 30 && Parent_Size <= 60) {
|
||||||
pdf(file = paste(output_dir, XX3[s, 2], ".pdf", sep = ""), width = 12, height = 10, onefile = TRUE)
|
|
||||||
|
pdf(
|
||||||
|
file = file.path(output_dir, paste(XX3[s, 2], ".pdf", sep = "")),
|
||||||
|
width = 12,
|
||||||
|
height = 10,
|
||||||
|
onefile = TRUE
|
||||||
|
)
|
||||||
|
|
||||||
for (i in 1:length(GOTerm_parent)) {
|
for (i in 1:length(GOTerm_parent)) {
|
||||||
GO_Term <- GOTerm_parent[i]
|
GO_Term <- GOTerm_parent[i]
|
||||||
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
||||||
@@ -650,7 +697,7 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
keysize = 0.5, trace = "none", density.info = c("none"), margins = c(10, 8),
|
||||||
na.color = "red", col = brewer.pal(11, "PuOr"),
|
na.color = "red", col = brewer.pal(11, "PuOr"),
|
||||||
main = GO_Term_Name,
|
main = GO_Term_Name,
|
||||||
#ColSideColors = ev_repeat,
|
# ColSideColors = ev_repeat,
|
||||||
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
labRow = as.character(Genes_Annotated_to_Term$Gene)
|
||||||
))
|
))
|
||||||
}
|
}
|
||||||
@@ -660,7 +707,14 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
if (Parent_Size >= 3 && Parent_Size <= 30) {
|
if (Parent_Size >= 3 && Parent_Size <= 30) {
|
||||||
pdf(file = paste(output_dir, XX3[s, 2], ".pdf", sep = ""), width = 12, height = 7, onefile = TRUE)
|
|
||||||
|
pdf(
|
||||||
|
file = file.path(output_dir, paste(XX3[s, 2], ".pdf", sep = "")),
|
||||||
|
width = 12,
|
||||||
|
height = 7,
|
||||||
|
onefile = TRUE
|
||||||
|
)
|
||||||
|
|
||||||
for (i in 1:length(GOTerm_parent)) {
|
for (i in 1:length(GOTerm_parent)) {
|
||||||
GO_Term <- GOTerm_parent[i]
|
GO_Term <- GOTerm_parent[i]
|
||||||
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
||||||
@@ -704,7 +758,14 @@ for (s in 1:dim(XX3)[1]) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
if (Parent_Size == 2) {
|
if (Parent_Size == 2) {
|
||||||
pdf(file = paste(output_dir, XX3[s, 2], ".pdf", sep = ""), width = 12, height = 7, onefile = TRUE)
|
|
||||||
|
pdf(
|
||||||
|
file = file.path(output_dir, paste(XX3[s, 2], ".pdf", sep = "")),
|
||||||
|
width = 12,
|
||||||
|
height = 7,
|
||||||
|
onefile = TRUE
|
||||||
|
)
|
||||||
|
|
||||||
for (i in 1:length(GOTerm_parent)) {
|
for (i in 1:length(GOTerm_parent)) {
|
||||||
GO_Term <- GOTerm_parent[i]
|
GO_Term <- GOTerm_parent[i]
|
||||||
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
GO_Term_Num <- as.integer(str_split_fixed(as.character(GO_Term), "\\:", 2)[, 2])
|
||||||
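Note on the repeated `pdf()` blocks: the hunks above differ only in the `height` chosen for each `Parent_Size` range, which is what the README TODO about reducing verbosity refers to. A minimal sketch of a lookup that could replace the chain of `if` blocks follows. `size_breaks`, `pdf_heights` and `pdf_height` are hypothetical names. The original ranges overlap at their boundaries (1000 satisfies both `>= 1000 && <= 2000` and `>= 500 && <= 1000`); assigning each boundary to the lower bin is an assumption of this sketch, and sizes below 2, which the source does not handle, return `NA`.

```r
# Upper bound of each size bin and the PDF height used for that bin.
# Heights are taken from the hunks above; bin edges are treated as (lo, hi].
size_breaks <- c(30, 60, 100, 200, 500, 1000, 2000)
pdf_heights <- c(7, 10, 15, 20, 25, 30, 35, 45)

# Hypothetical helper: map a parent term size to a PDF height.
pdf_height <- function(parent_size) {
  if (parent_size < 2) {
    return(NA_real_)
  }
  # left.open = TRUE counts the breaks strictly below parent_size,
  # so a value equal to a break stays in the lower bin.
  pdf_heights[findInterval(parent_size, size_breaks, left.open = TRUE) + 1]
}
```

With such a helper, a single call like `pdf(file = file.path(output_dir, paste0(XX3[s, 2], ".pdf")), width = 12, height = pdf_height(Parent_Size), onefile = TRUE)` could stand in for the nine separate blocks.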
|
|||||||
@@ -1,28 +1,27 @@
|
|||||||
#!/usr/bin/env Rscript
|
#!/usr/bin/env Rscript
|
||||||
# This script will make homology heatmaps for the REMc analysis
|
|
||||||
# This script didn't have any hard set inputs so I didn't bother
|
|
||||||
|
|
||||||
library(RColorBrewer)
|
library("RColorBrewer")
|
||||||
library(gplots)
|
library("gplots")
|
||||||
library(tidyverse)
|
library("tidyverse")
|
||||||
|
|
||||||
args <- commandArgs(TRUE)
|
args <- commandArgs(TRUE)
|
||||||
# Need to give the input "finalTable.csv" file after running REMc generated by eclipse
|
|
||||||
inputFinalTable <- file.path(args[1])
|
|
||||||
|
|
||||||
# Give the DAmP_list.txt as the third argument - will color the gene names differently
|
|
||||||
DAmPs <- file.path(Args[2])
|
|
||||||
DAmP_list <- read.delim(file = DAmPs, header = FALSE, stringsAsFactors = FALSE)
|
|
||||||
|
|
||||||
# Give the yeast human homology mapping as the fourth argument - will add the genes to the finalTable and use info for heatmaps
|
|
||||||
mapFile <- file.path(Args[3])
|
|
||||||
mapping <- read.csv(file = mapFile, stringsAsFactors = FALSE)
|
|
||||||
|
|
||||||
# Define the output path for the heatmaps - create this folder first - in linux terminal in the working folder use > mkdir filename_heatmaps
|
# Define the output path for the heatmaps - create this folder first - in linux terminal in the working folder use > mkdir filename_heatmaps
|
||||||
outputPath <- file.path(Args[4])
|
output_path <- file.path(args[1])
|
||||||
|
|
||||||
|
# Need to give the input "finalTable.csv" file after running REMc generated by eclipse
|
||||||
|
final_table <- file.path(args[2])
|
||||||
|
|
||||||
|
# Give the damp_list.txt as the third argument - will color the gene names differently
|
||||||
|
damps <- file.path(args[3])
|
||||||
|
damp_list <- read.delim(file = damps, header = FALSE, stringsAsFactors = FALSE)
|
||||||
|
|
||||||
|
# Give the yeast human homology mapping as the fourth argument - will add the genes to the finalTable and use info for heatmaps
|
||||||
|
map_file <- file.path(args[4])
|
||||||
|
mapping <- read.csv(file = map_file, stringsAsFactors = FALSE)
|
||||||
|
|
||||||
# Read in finalTablewithShift
|
# Read in finalTablewithShift
|
||||||
hmapfile <- data.frame(read.csv(file = inputFinalTable, header = TRUE, sep = ",", stringsAsFactors = FALSE))
|
hmapfile <- data.frame(read.csv(file = final_table, header = TRUE, sep = ",", stringsAsFactors = FALSE))
|
||||||
|
|
||||||
# Map the finalTable to the human homolog file
|
# Map the finalTable to the human homolog file
|
||||||
hmapfile_map <- hmapfile
|
hmapfile_map <- hmapfile
|
||||||
@@ -46,11 +45,11 @@ hmapfile_w_homolog <- full_join(hmapfile_map, mapping, by = c("ORFMatch" = "ense
|
|||||||
hmapfile_w_homolog <- hmapfile_w_homolog[is.na(hmapfile_w_homolog$likelihood) == FALSE, ]
|
hmapfile_w_homolog <- hmapfile_w_homolog[is.na(hmapfile_w_homolog$likelihood) == FALSE, ]
|
||||||
|
|
||||||
# Write csv with all info from mapping file
|
# Write csv with all info from mapping file
|
||||||
write.csv(hmapfile_w_homolog, file.path(outputPath, paste(inputFinalTable, "_WithHomologAll.csv", sep = "")), row.names = FALSE)
|
write.csv(hmapfile_w_homolog, file.path(output_path, paste(final_table, "_WithHomologAll.csv", sep = "")), row.names = FALSE)
|
||||||
|
|
||||||
# Remove the non matches and output another mapping file - this is also one used to make heatmaps
|
# Remove the non matches and output another mapping file - this is also one used to make heatmaps
|
||||||
hmapfile_w_homolog <- hmapfile_w_homolog[is.na(hmapfile_w_homolog$external_gene_name_Human) == FALSE, ]
|
hmapfile_w_homolog <- hmapfile_w_homolog[is.na(hmapfile_w_homolog$external_gene_name_Human) == FALSE, ]
|
||||||
write.csv(hmapfile_w_homolog, file.path(outputPath, paste(inputFinalTable, "_WithHomologMatchesOnly.csv", sep = "")), row.names = FALSE)
|
write.csv(hmapfile_w_homolog, file.path(output_path, paste(final_table, "_WithHomologMatchesOnly.csv", sep = "")), row.names = FALSE)
|
||||||
|
|
||||||
# Add human gene name to the Gene column
|
# Add human gene name to the Gene column
|
||||||
hmapfile_w_homolog$Gene <- paste(hmapfile_w_homolog$Gene, hmapfile_w_homolog$external_gene_name_Human, sep = "/")
|
hmapfile_w_homolog$Gene <- paste(hmapfile_w_homolog$Gene, hmapfile_w_homolog$external_gene_name_Human, sep = "/")
|
||||||
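Review note: the hunk above merges the final table with the homology mapping, drops unmatched rows, and writes the result. The sketch below shows that filter-and-write step on toy data, using `!is.na()` and passing `row.names = FALSE` to `write.csv()` rather than to `file.path()`. The mapping-side join key is truncated in the hunk header, so `mapping_key` is a placeholder, and the data values and output file name are made up for illustration.

```r
# Toy stand-ins for hmapfile and the homology mapping (values are invented).
hmapfile <- data.frame(
  ORFMatch = c("YAL001C", "YAL002W", "YAL003W"),
  Gene = c("TFC3", "VPS8", "EFB1"),
  stringsAsFactors = FALSE
)
mapping <- data.frame(
  mapping_key = c("YAL002W", "YAL003W"), # placeholder for the real key column
  external_gene_name_Human = c("VPS8", NA),
  stringsAsFactors = FALSE
)

# Full outer join in base R, comparable in intent to dplyr::full_join().
merged <- merge(hmapfile, mapping,
  by.x = "ORFMatch", by.y = "mapping_key", all = TRUE
)

# Keep only rows that have a human homolog.
matches_only <- merged[!is.na(merged$external_gene_name_Human), ]

# row.names belongs to write.csv(), not to file.path().
out_file <- file.path(tempdir(), "finalTable_WithHomologMatchesOnly.csv")
write.csv(matches_only, out_file, row.names = FALSE)
```

Writing to `tempdir()` keeps the example side-effect free; the real script writes under `output_path`.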
@@ -176,14 +175,14 @@ if (grepl("Shift", colnames(hmapfile)[4], fixed = TRUE) == FALSE) {
|
|||||||
# m <- 0
|
# m <- 0
|
||||||
colnames_edit <- as.character(colnames(hmapfile)[4:(length(hmapfile[1, ]) - 3)])
|
colnames_edit <- as.character(colnames(hmapfile)[4:(length(hmapfile[1, ]) - 3)])
|
||||||
|
|
||||||
colnames(DAmP_list)[1] <- "ORF"
|
colnames(damp_list)[1] <- "ORF"
|
||||||
hmapfile$DAmPs <- "YKO"
|
hmapfile$damps <- "YKO"
|
||||||
colnames(hmapfile)[2] <- "ORF"
|
colnames(hmapfile)[2] <- "ORF"
|
||||||
try(hmapfile[hmapfile$ORF %in% DAmP_list$ORF, ]$DAmPs <- "YKD")
|
try(hmapfile[hmapfile$ORF %in% damp_list$ORF, ]$damps <- "YKD")
|
||||||
# X <- X[order(X$DAmPs,decreasing = TRUE),]
|
# X <- X[order(X$damps,decreasing = TRUE),]
|
||||||
hmapfile$color2 <- NA
|
hmapfile$color2 <- NA
|
||||||
try(hmapfile[hmapfile$DAmPs == "YKO", ]$color2 <- "black")
|
try(hmapfile[hmapfile$damps == "YKO", ]$color2 <- "black")
|
||||||
try(hmapfile[hmapfile$DAmPs == "YKD", ]$color2 <- "red")
|
try(hmapfile[hmapfile$damps == "YKD", ]$color2 <- "red")
|
||||||
|
|
||||||
hmapfile$color <- NA
|
hmapfile$color <- NA
|
||||||
try(hmapfile[hmapfile$hsapiens_homolog_orthology_type == "ortholog_many2many", ]$color <- "#F8766D")
|
try(hmapfile[hmapfile$hsapiens_homolog_orthology_type == "ortholog_many2many", ]$color <- "#F8766D")
|
||||||
@@ -231,7 +230,7 @@ for (i in 1:num_unique_clusts) {
|
|||||||
if (cluster_length != 1) {
|
if (cluster_length != 1) {
|
||||||
X0 <- as.matrix(cluster_data[, 4:(length(hmapfile[1, ]) - 6)])
|
X0 <- as.matrix(cluster_data[, 4:(length(hmapfile[1, ]) - 6)])
|
||||||
if (cluster_length >= 2001) {
|
if (cluster_length >= 2001) {
|
||||||
mypath <- file.path(outputPath, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
mypath <- file.path(output_path, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
||||||
pdf(file = mypath, height = 20, width = 15)
|
pdf(file = mypath, height = 20, width = 15)
|
||||||
heatmap.2(
|
heatmap.2(
|
||||||
x = X0,
|
x = X0,
|
||||||
@@ -251,7 +250,7 @@ for (i in 1:num_unique_clusts) {
|
|||||||
dev.off()
|
dev.off()
|
||||||
}
|
}
|
||||||
if (cluster_length >= 201 && cluster_length <= 2000) {
|
if (cluster_length >= 201 && cluster_length <= 2000) {
|
||||||
mypath <- file.path(outputPath, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
mypath <- file.path(output_path, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
||||||
pdf(file = mypath, height = 15, width = 12)
|
pdf(file = mypath, height = 15, width = 12)
|
||||||
heatmap.2(
|
heatmap.2(
|
||||||
x = X0,
|
x = X0,
|
||||||
@@ -270,7 +269,7 @@ for (i in 1:num_unique_clusts) {
|
|||||||
dev.off()
|
dev.off()
|
||||||
}
|
}
|
||||||
if (cluster_length >= 150 && cluster_length <= 200) {
|
if (cluster_length >= 150 && cluster_length <= 200) {
|
||||||
mypath <- file.path(outputPath, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
mypath <- file.path(output_path, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
||||||
pdf(file = mypath, height = 12, width = 12)
|
pdf(file = mypath, height = 12, width = 12)
|
||||||
heatmap.2(
|
heatmap.2(
|
||||||
x = X0,
|
x = X0,
|
||||||
@@ -288,7 +287,7 @@ for (i in 1:num_unique_clusts) {
|
|||||||
dev.off()
|
dev.off()
|
||||||
}
|
}
|
||||||
if (cluster_length >= 101 && cluster_length <= 149) {
|
if (cluster_length >= 101 && cluster_length <= 149) {
|
||||||
mypath <- file.path(outputPath, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
mypath <- file.path(output_path, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
||||||
pdf(file = mypath, height = 12, width = 12)
|
pdf(file = mypath, height = 12, width = 12)
|
||||||
heatmap.2(
|
heatmap.2(
|
||||||
x = X0,
|
x = X0,
|
||||||
@@ -306,7 +305,7 @@ for (i in 1:num_unique_clusts) {
|
|||||||
dev.off()
|
dev.off()
|
||||||
}
|
}
|
||||||
if (cluster_length >= 60 && cluster_length <= 100) {
|
if (cluster_length >= 60 && cluster_length <= 100) {
|
||||||
mypath <- file.path(outputPath, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
mypath <- file.path(output_path, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
||||||
pdf(file = mypath, height = 12, width = 12)
|
pdf(file = mypath, height = 12, width = 12)
|
||||||
heatmap.2(
|
heatmap.2(
|
||||||
x = X0,
|
x = X0,
|
||||||
@@ -324,7 +323,7 @@ for (i in 1:num_unique_clusts) {
|
|||||||
dev.off()
|
dev.off()
|
||||||
}
|
}
|
||||||
if (cluster_length <= 59 && cluster_length >= 30) {
|
if (cluster_length <= 59 && cluster_length >= 30) {
|
||||||
mypath <- file.path(outputPath, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
mypath <- file.path(output_path, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
||||||
pdf(file = mypath, height = 9, width = 12)
|
pdf(file = mypath, height = 9, width = 12)
|
||||||
heatmap.2(
|
heatmap.2(
|
||||||
x = X0,
|
x = X0,
|
||||||
@@ -342,7 +341,7 @@ for (i in 1:num_unique_clusts) {
|
|||||||
dev.off()
|
dev.off()
|
||||||
}
|
}
|
||||||
if (cluster_length <= 29) {
|
if (cluster_length <= 29) {
|
||||||
mypath <- file.path(outputPath, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
mypath <- file.path(output_path, paste("cluster_", gsub(" ", "", cluster), ".pdf", sep = ""))
|
||||||
pdf(file = mypath, height = 7, width = 12)
|
pdf(file = mypath, height = 7, width = 12)
|
||||||
heatmap.2(
|
heatmap.2(
|
||||||
x = X0,
|
x = X0,
|
||||||
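Review note: the seven `if (cluster_length ...)` blocks in the hunks above differ, as far as the visible lines show, only in the `height` and `width` passed to `pdf()`. A possible consolidation is a small lookup, sketched below. `pdf_dims` is a hypothetical helper, not part of the commit, and it assumes the hidden `heatmap.2()` arguments do not also vary by cluster size.

```r
# Hypothetical helper: map a cluster size to the PDF dimensions used above.
pdf_dims <- function(cluster_length) {
  # Lower bounds of each size band, taken from the if-conditions in the diff:
  # <=29, 30-59, 60-200, 201-2000, >=2001
  lower_bounds <- c(30, 60, 201, 2001)
  heights <- c(7, 9, 12, 15, 20)
  widths <- c(12, 12, 12, 12, 15)
  # findInterval() returns 0 below the first bound, so add 1 to index the bands.
  idx <- findInterval(cluster_length, lower_bounds) + 1
  c(height = heights[idx], width = widths[idx])
}

dims <- pdf_dims(150)
print(dims)
```

With such a helper, a single `pdf(file = mypath, height = dims[["height"]], width = dims[["width"]])` call could replace the repeated blocks. The three bands covering 60 to 200 collapse into one because they all use 12 by 12.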
|
|||||||
@@ -50,7 +50,7 @@ if (length(args) >= 5) {
|
|||||||
# ZScores_Interaction.csv
|
# ZScores_Interaction.csv
|
||||||
for (m in 1:length(zscores_file)) {
|
for (m in 1:length(zscores_file)) {
|
||||||
|
|
||||||
#zscores_file <- paste(Wstudy,"/",expName[m],'/ZScores/ZScores_Interaction.csv',sep="") #ArgsScore[1]
|
# zscores_file <- paste(Wstudy,"/",expName[m],'/ZScores/ZScores_Interaction.csv',sep="") #ArgsScore[1]
|
||||||
X <- read.csv(file = zscores_file[m], stringsAsFactors = FALSE, header = TRUE)
|
X <- read.csv(file = zscores_file[m], stringsAsFactors = FALSE, header = TRUE)
|
||||||
|
|
||||||
if (colnames(X)[1] == "OrfRep") {
|
if (colnames(X)[1] == "OrfRep") {
|
||||||
|
|||||||
File diff suppressed because it is too large
@@ -1,44 +1,45 @@
|
|||||||
#!/usr/bin/env Rscript
|
#!/usr/bin/env Rscript
|
||||||
# JoinInteractExps.R
|
# JoinInteractExps.R
|
||||||
|
|
||||||
library(plyr)
|
library("plyr")
|
||||||
library(sos)
|
library("sos")
|
||||||
library(dplyr)
|
library("dplyr")
|
||||||
|
|
||||||
args <- commandArgs(TRUE)
|
args <- commandArgs(TRUE)
|
||||||
|
|
||||||
# Set output dir
|
# Set output dir
|
||||||
if (length(args) >= 1) {
|
if (length(args) >= 1) {
|
||||||
outDir <- file.path(args[1])
|
out_dir <- file.path(args[1])
|
||||||
} else {
|
} else {
|
||||||
outDir <- "./" # for legacy workflow
|
out_dir <- "./" # for legacy workflow
|
||||||
}
|
}
|
||||||
|
|
||||||
# Set sd value
|
# Set sd value
|
||||||
if (length(args) >= 2) {
|
if (length(args) >= 2) {
|
||||||
sd <- args[2]
|
sd <- as.numeric(args[2])
|
||||||
} else {
|
} else {
|
||||||
sd <- 2 # default value
|
sd <- 2 # default value
|
||||||
}
|
}
|
||||||
print(paste("SD=", sd))
|
|
||||||
|
|
||||||
# Set studyInfo file
|
sprintf("SD value is: %f", sd)
|
||||||
|
|
||||||
|
# Set study_info file
|
||||||
if (length(args) >= 3) {
|
if (length(args) >= 3) {
|
||||||
studyInfo <- file.path(args[3])
|
study_info <- file.path(args[3])
|
||||||
} else {
|
} else {
|
||||||
studyInfo <- "../Code/StudyInfo.csv" # for legacy workflow
|
study_info <- "../Code/StudyInfo.csv" # for legacy workflow
|
||||||
}
|
}
|
||||||
|
|
||||||
studies <- args[3:length(args)]
|
studies <- args[3:length(args)]
|
||||||
inputFiles <- c()
|
input_files <- c()
|
||||||
for (study in 1:length(studies)) {
|
for (study in 1:length(studies)) {
|
||||||
  zsFile <- file.path(studies[study], "zscores", "zscores_interaction.csv")
|
  zs_file <- file.path(studies[study], "zscores", "zscores_interaction.csv")
|
||||||
if (file.exists(zsFile)) {
|
if (file.exists(zs_file)) {
|
||||||
inputFiles[study] <- zsFile
|
input_files[study] <- zs_file
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
print(length(inputFiles))
|
print(length(input_files))
|
||||||
|
|
||||||
# TODO this is better handled in a loop in case you want to compare more experiments?
|
# TODO this is better handled in a loop in case you want to compare more experiments?
|
||||||
# The input is already designed for this
|
# The input is already designed for this
|
||||||
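Review note: a vectorized alternative to the file-collection loop above, assuming each entry of `studies` is a study directory path. `find_zscore_files` is a hypothetical helper name, and the temporary directories exist only to make the example runnable.

```r
# Hypothetical helper: return the z-score file of every study directory that has one.
find_zscore_files <- function(study_dirs) {
  # file.path() is vectorized, so this builds one candidate path per directory.
  candidates <- file.path(study_dirs, "zscores", "zscores_interaction.csv")
  candidates[file.exists(candidates)]
}

# Demo layout: exp1 has a z-score file, exp2 does not.
base_dir <- tempfile("studies_")
study_dirs <- file.path(base_dir, c("exp1", "exp2"))
dir.create(file.path(study_dirs[1], "zscores"), recursive = TRUE)
dir.create(study_dirs[2], recursive = TRUE)
invisible(file.create(file.path(study_dirs[1], "zscores", "zscores_interaction.csv")))

input_files <- find_zscore_files(study_dirs)
print(length(input_files))
```

Unlike assigning into `input_files[study]` by index, this leaves no `NA` gaps when a study directory lacks a z-score file, so `length(input_files)` counts only files that exist.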
@@ -46,38 +47,38 @@ print(length(inputFiles))
|
|||||||
# Join the two files at a time as a function of how many inputFile
|
# Join the two files at a time as a function of how many inputFile
|
||||||
# list the larger file first ? in this example X2 has the larger number of genes
|
# list the larger file first ? in this example X2 has the larger number of genes
|
||||||
# If X1 has a larger number of genes, switch the order of X1 and X2
|
# If X1 has a larger number of genes, switch the order of X1 and X2
|
||||||
if (length(inputFiles) == 2) {
|
if (length(input_files) == 2) {
|
||||||
X1 <- read.csv(file = inputFiles[1], stringsAsFactors = FALSE)
|
X1 <- read.csv(file = input_files[1], stringsAsFactors = FALSE)
|
||||||
X2 <- read.csv(file = inputFiles[2], stringsAsFactors = FALSE)
|
X2 <- read.csv(file = input_files[2], stringsAsFactors = FALSE)
|
||||||
X <- join(X1, X2, by = "OrfRep")
|
X <- join(X1, X2, by = "OrfRep")
|
||||||
OBH <- X[, order(colnames(X))] # OrderByHeader
|
OBH <- X[, order(colnames(X))] # OrderByHeader
|
||||||
headSel <- select(OBH, contains("OrfRep"), matches("Gene"),
|
headSel <- select(OBH, contains("OrfRep"), matches("Gene"),
|
||||||
contains("Z_lm_K"), contains("Z_Shift_K"), contains("Z_lm_L"), contains("Z_Shift_L"))
|
contains("Z_lm_K"), contains("Z_Shift_K"), contains("Z_lm_L"), contains("Z_Shift_L"))
|
||||||
headSel <- select(headSel, -"Gene.1") #remove "Gene.1 column
|
headSel <- select(headSel, -"Gene.1") # remove "Gene.1 column
|
||||||
headSel2 <- select(OBH, contains("OrfRep"), matches("Gene")) #Frame for interleaving Z_lm with Shift colums
|
headSel2 <- select(OBH, contains("OrfRep"), matches("Gene")) #Frame for interleaving Z_lm with Shift colums
|
||||||
headSel2 <- select(headSel2, -"Gene.1") #remove "Gene.1 column #Frame for interleaving Z_lm with Shift colums
|
headSel2 <- select(headSel2, -"Gene.1") # remove "Gene.1 column #Frame for interleaving Z_lm with Shift colums
|
||||||
} else if (length(inputFiles) == 3) {
|
} else if (length(input_files) == 3) {
|
||||||
X1 <- read.csv(file = inputFiles[1], stringsAsFactors = FALSE) #exp1File,stringsAsFactors = FALSE)
|
X1 <- read.csv(file = input_files[1], stringsAsFactors = FALSE) # exp1File,stringsAsFactors = FALSE)
|
||||||
X2 <- read.csv(file = inputFiles[2], stringsAsFactors = FALSE) #exp2File,stringsAsFactors = FALSE)
|
X2 <- read.csv(file = input_files[2], stringsAsFactors = FALSE) # exp2File,stringsAsFactors = FALSE)
|
||||||
X3 <- read.csv(file = inputFiles[3], stringsAsFactors = FALSE) #exp3File,stringsAsFactors = FALSE)
|
X3 <- read.csv(file = input_files[3], stringsAsFactors = FALSE) # exp3File,stringsAsFactors = FALSE)
|
||||||
X <- join(X1, X2, by = "OrfRep")
|
X <- join(X1, X2, by = "OrfRep")
|
||||||
X <- join(X, X3, by = "OrfRep")
|
X <- join(X, X3, by = "OrfRep")
|
||||||
OBH <- X[, order(colnames(X))] #OrderByHeader
|
OBH <- X[, order(colnames(X))] # OrderByHeader
|
||||||
headSel <- select(OBH, contains("OrfRep"), matches("Gene"),
|
headSel <- select(OBH, contains("OrfRep"), matches("Gene"),
|
||||||
contains("Z_lm_K"), contains("Z_Shift_K"), contains("Z_lm_L"), contains("Z_Shift_L"))
|
contains("Z_lm_K"), contains("Z_Shift_K"), contains("Z_lm_L"), contains("Z_Shift_L"))
|
||||||
headSel <- select(headSel, -"Gene.1", -"Gene.2")
|
headSel <- select(headSel, -"Gene.1", -"Gene.2")
|
||||||
headSel2 <- select(OBH, contains("OrfRep"), matches("Gene"))
|
headSel2 <- select(OBH, contains("OrfRep"), matches("Gene"))
|
||||||
headSel2 <- select(headSel2, -"Gene.1", -"Gene.2")
|
headSel2 <- select(headSel2, -"Gene.1", -"Gene.2")
|
||||||
|
|
||||||
} else if (length(inputFiles) == 4) {
|
} else if (length(input_files) == 4) {
|
||||||
X1 <- read.csv(file = inputFiles[1], stringsAsFactors = FALSE) #exp1File,stringsAsFactors = FALSE)
|
X1 <- read.csv(file = input_files[1], stringsAsFactors = FALSE) # exp1File,stringsAsFactors = FALSE)
|
||||||
X2 <- read.csv(file = inputFiles[2], stringsAsFactors = FALSE) #exp2File,stringsAsFactors = FALSE)
|
X2 <- read.csv(file = input_files[2], stringsAsFactors = FALSE) # exp2File,stringsAsFactors = FALSE)
|
||||||
X3 <- read.csv(file = inputFiles[3], stringsAsFactors = FALSE) #exp3File,stringsAsFactors = FALSE)
|
X3 <- read.csv(file = input_files[3], stringsAsFactors = FALSE) # exp3File,stringsAsFactors = FALSE)
|
||||||
X4 <- read.csv(file = inputFiles[4], stringsAsFactors = FALSE) #exp4File,stringsAsFactors = FALSE)
|
X4 <- read.csv(file = input_files[4], stringsAsFactors = FALSE) # exp4File,stringsAsFactors = FALSE)
|
||||||
X <- join(X1, X2, by = "OrfRep")
|
X <- join(X1, X2, by = "OrfRep")
|
||||||
X <- join(X, X3, by = "OrfRep")
|
X <- join(X, X3, by = "OrfRep")
|
||||||
X <- join(X, X4, by = "OrfRep")
|
X <- join(X, X4, by = "OrfRep")
|
||||||
OBH <- X[, order(colnames(X))] #OrderByHeader
|
OBH <- X[, order(colnames(X))] # OrderByHeader
|
||||||
headSel <- select(OBH, contains("OrfRep"), matches("Gene"),
|
headSel <- select(OBH, contains("OrfRep"), matches("Gene"),
|
||||||
contains("Z_lm_K"), contains("Z_Shift_K"), contains("Z_lm_L"), contains("Z_Shift_L"))
|
contains("Z_lm_K"), contains("Z_Shift_K"), contains("Z_lm_L"), contains("Z_Shift_L"))
|
||||||
headSel <- select(headSel, -"Gene.1", -"Gene.2", -"Gene.3")
|
headSel <- select(headSel, -"Gene.1", -"Gene.2", -"Gene.3")
|
||||||
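Review note: the `length(input_files) == 2`, `== 3`, and `== 4` branches above repeat the same read-then-join pattern, which is what the TODO about handling this in a loop refers to. The sketch below folds any number of tables with `Reduce()`. It is an illustration under assumptions, not the commit's method: it uses base `merge(..., all.x = TRUE)` in place of `plyr::join()` (both are left joins by default here), tags columns with a made-up `_expN` suffix instead of relying on `Gene.1`-style deduplication, and runs on invented toy data.

```r
# Hypothetical helper: left-join any number of z-score tables on a key column.
join_all_on_orf <- function(tables, key = "OrfRep") {
  # Tag every non-key column with its experiment index so names stay unique.
  tagged <- lapply(seq_along(tables), function(i) {
    tbl <- tables[[i]]
    other <- colnames(tbl) != key
    colnames(tbl)[other] <- paste0(colnames(tbl)[other], "_exp", i)
    tbl
  })
  # Fold the list pairwise; all.x = TRUE keeps every row of the first table.
  Reduce(function(acc, nxt) merge(acc, nxt, by = key, all.x = TRUE, sort = FALSE), tagged)
}

# Toy tables standing in for the per-experiment CSV files (values are invented).
x1 <- data.frame(OrfRep = c("YAL001C", "YAL002W"), Gene = c("TFC3", "VPS8"),
                 Z_lm_K = c(1.5, -0.2), stringsAsFactors = FALSE)
x2 <- data.frame(OrfRep = c("YAL002W", "YAL003W"), Gene = c("VPS8", "EFB1"),
                 Z_lm_K = c(0.7, 2.1), stringsAsFactors = FALSE)
x3 <- data.frame(OrfRep = "YAL001C", Gene = "TFC3",
                 Z_lm_K = 0.3, stringsAsFactors = FALSE)

joined <- join_all_on_orf(list(x1, x2, x3))
print(colnames(joined))
```

In the script itself the list would come from something like `lapply(input_files, read.csv, stringsAsFactors = FALSE)`, and the later `select()` calls would need to match the suffixed column names.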
@@ -221,13 +222,13 @@ if (std == 0) {
|
|||||||
# R places hidden "" around the header names. The following
|
# R places hidden "" around the header names. The following
|
||||||
# is intended to remove those quote so that the "" do not blow up the Java REMc.
|
# is intended to remove those quote so that the "" do not blow up the Java REMc.
|
||||||
# Use ,quote=F in the write.csv statement to fix R output file.
|
# Use ,quote=F in the write.csv statement to fix R output file.
|
||||||
# write.csv(combI,file.path(outDir,"CombinedKLzscores.csv"), row.names = FALSE)
|
# write.csv(combI,file.path(out_dir,"CombinedKLzscores.csv"), row.names = FALSE)
|
||||||
write.csv(REMcRdy, file.path(outDir, "REMcRdy_lm_only.csv"), row.names = FALSE, quote = FALSE)
|
write.csv(REMcRdy, file.path(out_dir, "REMcRdy_lm_only.csv"), row.names = FALSE, quote = FALSE)
|
||||||
write.csv(shiftOnly, file.path(outDir, "Shift_only.csv"), row.names = FALSE, quote = FALSE)
|
write.csv(shiftOnly, file.path(out_dir, "Shift_only.csv"), row.names = FALSE, quote = FALSE)
|
||||||
#LabelStd <- read.table(file="./parameters.csv",stringsAsFactors = FALSE,sep = ",")
|
#LabelStd <- read.table(file="./parameters.csv",stringsAsFactors = FALSE,sep = ",")
|
||||||
|
|
||||||
LabelStd <- read.csv(file = studyInfo, stringsAsFactors = FALSE)
|
LabelStd <- read.csv(file = study_info, stringsAsFactors = FALSE)
|
||||||
print(std)
|
print(std)
|
||||||
LabelStd[, 4] <- as.numeric(std)
|
LabelStd[, 4] <- as.numeric(std)
|
||||||
write.csv(LabelStd, file = file.path(outDir, "parameters.csv"), row.names = FALSE)
|
write.csv(LabelStd, file = file.path(out_dir, "parameters.csv"), row.names = FALSE)
|
||||||
write.csv(LabelStd, file = studyInfo, row.names = FALSE)
|
write.csv(LabelStd, file = study_info, row.names = FALSE)
|
||||||
|
|||||||
File diff suppressed because it is too large