Rollup before parallelization
@@ -33,7 +33,7 @@ Insert a general description of Q-HTCP and the Q-HTCP process here.
|
||||
* [pl_gtf_terms2tsv](#plgtfterms2tsv)
|
||||
* [py_gtf_concat](#pygtfconcat)
|
||||
* [r_compile_gtf](#rcompilegtf)
|
||||
* [get_studies](#getstudies)
|
||||
* [study_info](#studyinfo)
|
||||
* [choose_easy_results](#chooseeasyresults)
|
||||
|
||||
## Notes
|
||||
@@ -183,7 +183,7 @@ If you wish to install them manually, you can use the following information to d
|
||||
|
||||
#### Perl
|
||||
|
||||
* `cpan File::Map ExtUtils::PkgConfig GD GO::TermFinder`
|
||||
* `cpan -I -i File::Map ExtUtils::PkgConfig GD GO::TermFinder`
|
||||
|
||||
#### R
|
||||
|
||||
@@ -199,7 +199,7 @@ This module:
|
||||
|
||||
* Initializes a project directory in the scans directory
|
||||
|
||||
TODO
|
||||
:bulb: **TODO**
|
||||
|
||||
* Copy over source image directories from robot
|
||||
* MasterPlate_ file **should not be an xlsx file**, no portability
|
||||
@@ -207,7 +207,7 @@ TODO
|
||||
* But moving forward should switch to csv or something open
|
||||
* Do we need to sync a QHTCP template?
|
||||
|
||||
NOTES
|
||||
:memo: **NOTES**
|
||||
|
||||
* Copy over the images from the robot and then DO NOT TOUCH that directory except to copy from it
|
||||
* Write-protect (read-only) if we need to
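
One way to write-protect the copied image directory is sketched below. The function name `protect_source_images` and its single directory argument are assumptions for illustration only and are not part of the existing scripts.

```shell
# Hypothetical helper: make a copied source image directory read-only.
# $1: directory holding the images copied from the robot (assumed argument)
protect_source_images() {
  # Remove write permission for user, group, and others, recursively.
  # Clearing the write bit on the directories also blocks deleting or adding files.
  chmod -R a-w "$1"
}
```

To undo this later (for example, to re-copy from the robot), `chmod -R u+w` on the same directory restores write access for the owner.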
|
||||
@@ -522,12 +522,11 @@ TODO WIP
|
||||
System for Multi-QHTCP-Experiment Gene Interaction Profiling Analysis
|
||||
|
||||
* Functional rewrite of REMcMaster3.sh, RemcMaster2.sh, REMcJar2.sh, ExpFrontend.m, mProcess.sh, mFunction.sh, mComponent.sh
|
||||
* Added a newline character to the end of StudyInfo.csv so it is a valid text file
|
||||
* Added a newline character to the end of the study info file so it is a valid text file
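
A minimal sketch of how a trailing newline could be added only when it is missing. The function name `ensure_trailing_newline` is hypothetical, and this is not necessarily how the module does it.

```shell
# Hypothetical helper: append a newline only if the file does not already end with one.
# $1: path to the study info file (assumed argument)
ensure_trailing_newline() {
  # tail -c 1 prints the last byte; command substitution strips a trailing
  # newline, so a non-empty result means the last byte is not a newline.
  if [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]; then
    printf '\n' >> "$1"
  fi
}
```

Because the check runs first, calling it repeatedly on the same file does not keep adding blank lines.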
|
||||
|
||||
TODO
|
||||
|
||||
* Suggest renaming StudiesQHTCP to something like qhtcp, qhtcp_output, or output
|
||||
* Store StudyInfo somewhere better
|
||||
* Move (hide) the study template somewhere else
|
||||
* StudiesArchive should be smarter:
|
||||
* Create a database with as much information as possible
|
||||
@@ -592,7 +591,7 @@ TODO
|
||||
|
||||
#### Arguments
|
||||
|
||||
* **$1** (string): studyInfo file
|
||||
* **$1** (string): study info file
|
||||
|
||||
### gtf
|
||||
|
||||
@@ -640,14 +639,14 @@ TODO
|
||||
* Is GTAtemplate.R actually a template?
|
||||
* Do we need to allow user customization?
|
||||
|
||||
Files
|
||||
INPUT
|
||||
|
||||
* [gene_association.sgd](https://downloads.yeastgenome.org/curation/chromosomal_feature/gene_association.sgd)
|
||||
* go_terms.tab
|
||||
|
||||
Output
|
||||
OUTPUT
|
||||
|
||||
*
|
||||
* Average_GOTerms_All.csv
|
||||
|
||||
#### Arguments
|
||||
|
||||
@@ -663,11 +662,13 @@ PairwiseLK.R R script
|
||||
|
||||
TODO
|
||||
|
||||
* Should move directory creation from PairwiseLK.R to gta module
|
||||
* Move directory creation from PairwiseLK.R to gta module
|
||||
* Needs better output filenames and directory organization
|
||||
* Needs more for looping to reduce verbosity
|
||||
|
||||
Files
|
||||
INPUT
|
||||
|
||||
*
|
||||
* Average_GOTerms_All.csv
|
||||
*
|
||||
|
||||
Output
|
||||
@@ -684,7 +685,7 @@ This wrapper:
|
||||
|
||||
* **$1** (string): First Exp# name
|
||||
* **$2** (string): Second Exp# name
|
||||
* **$3** (string): StudyInfo.csv file
|
||||
* **$3** (string): study info file
|
||||
* **$4** (string): output directory
|
||||
|
||||
### r_gta_heatmaps
|
||||
@@ -693,9 +694,10 @@ TSHeatmaps5dev2.R R script
|
||||
|
||||
TODO
|
||||
|
||||
* Script could use rename
|
||||
* Script should be refactored to automatically allow more studies
|
||||
* Script should be refactored with more looping to reduce verbosity
|
||||
* Rename
|
||||
* Refactor to automatically allow more studies
|
||||
* Refactor with more looping to reduce verbosity
|
||||
* Reduce cyclomatic complexity of some of the for loops
|
||||
|
||||
Files
|
||||
|
||||
@@ -709,13 +711,13 @@ Output
|
||||
This wrapper:
|
||||
|
||||
* The Term Specific Heatmaps are produced directly from the ../ExpStudy/Exp_/ZScores/ZScores_Interaction.csv file generated by the user modified interaction… .R script.
|
||||
* The heatmap labeling is per the names the user wrote into the StudyInfo.txt spreadsheet.
|
||||
* The heatmap labeling is per the names the user wrote into the study info file
|
||||
* Verify that the All_SGD_GOTerms_for_QHTCPtk.csv found in ../Code is what you wish to use or if you wish to use a custom modified version.
|
||||
* If you wish to use a custom modified version, create it and modify the TSHeatmaps template script (TSHeatmaps5dev2.R) and save it as a ‘TSH_study specific name’.
|
||||
|
||||
#### Arguments
|
||||
|
||||
* **$1** (string): StudyInfo.csv file
|
||||
* **$1** (string): study info file
|
||||
* **$2** (string): gene_ontology_edit.obo file
|
||||
* **$3** (string): go_terms.tab file
|
||||
* **$4** (string): All_SGD_GOTerms_for_QHTCPtk.csv
|
||||
@@ -737,6 +739,14 @@ TODO
|
||||
* Re-enable disabled linter checks
|
||||
* Reduce cyclomatic complexity of some of the for loops
|
||||
* There needs to be one point of truth for the SD factor
|
||||
* Replace most paste() functions with sprintf()
|
||||
|
||||
INPUT
|
||||
|
||||
* easy/results_std.txt
|
||||
|
||||
|
||||
|
||||
|
||||
NOTES
|
||||
|
||||
@@ -744,18 +754,26 @@ NOTES
|
||||
|
||||
#### Arguments
|
||||
|
||||
* **$1** (string): The input directory
|
||||
* **$1** (string): The input results_std.txt
|
||||
* **$2** (string): The zscores directory
|
||||
* **$3** (string): The study info file
|
||||
* **$4** (string): SGD_features.tab
|
||||
* **$5** (integer): delta SD background value (default: 5)
|
||||
* **$6** (integer): experiment number
|
||||
* **$5** (integer): experiment number
|
||||
* **$6** (integer): delta SD background value (default: 3)
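
The new argument order, with the optional delta SD value last, could be read in a wrapper roughly as follows. The function and variable names here are hypothetical, and the sketch only illustrates the default of 3 documented above.

```shell
# Hypothetical sketch: read the six positional arguments in the new order.
parse_interaction_args() {
  input_file=$1     # results_std.txt
  zscores_dir=$2    # zscores directory
  study_info=$3     # study info file
  sgd_features=$4   # SGD_features.tab
  exp_num=$5        # experiment number
  sd_factor=${6:-3} # delta SD background value; falls back to 3 when omitted
  printf '%s\n' "$sd_factor"
}
```

Placing the optional value last lets callers omit it entirely, and keeping the default in one expansion gives a single place to change it.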
|
||||
|
||||
### r_join_interactions
|
||||
|
||||
JoinInteractExps3dev.R creates REMcRdy_lm_only.csv and Shift_only.csv
|
||||
|
||||
Output
|
||||
TODO
|
||||
|
||||
* Needs more loops to reduce verbosity
|
||||
|
||||
INPUT
|
||||
|
||||
*
|
||||
|
||||
OUTPUT
|
||||
|
||||
* REMcRdy_lm_only.csv
|
||||
* Shift_only.csv
|
||||
@@ -765,7 +783,7 @@ Output
|
||||
|
||||
* **$1** (string): The output directory
|
||||
* **$2** (string): The sd value
|
||||
* **$3** (string): The studyInfo file
|
||||
* **$3** (string): The study info file
|
||||
|
||||
### java_extract
|
||||
|
||||
@@ -785,10 +803,10 @@ NOTE
|
||||
|
||||
#### Arguments
|
||||
|
||||
* **$1** (string): GeneByGOAttributeMatrix_nofiltering-2009Dec07.tab
|
||||
* **$1** (string): The output directory
|
||||
* **$2** (string): ORF_List_Without_DAmPs.txt
|
||||
* **$3** (string): REMcRdy_lm_only.csv
|
||||
* **$4** (string): The output directory
|
||||
* **$4** (string): GeneByGOAttributeMatrix_nofiltering-2009Dec07.tab
|
||||
* **$5** (string): The output file
|
||||
|
||||
#### Exit codes
|
||||
@@ -805,13 +823,25 @@ and output "REMcWithShift.csv" for use with the REMc heat maps
|
||||
|
||||
* **$1** (string): REMcRdy_lm_only.csv-finalTable.csv
|
||||
* **$2** (string): Shift_only.csv
|
||||
* **$3** (string): StudyInfo.csv file
|
||||
* **$4** (string): The sd value
|
||||
* **$3** (string): study info file
|
||||
* **$4** (string): sd value
|
||||
|
||||
### r_create_heat_maps
|
||||
|
||||
Execute createHeatMaps.R
|
||||
|
||||
INPUT
|
||||
|
||||
* REMcWithShift.csv
|
||||
|
||||
OUTPUT
|
||||
|
||||
* compiledREMcHeatmaps.pdf
|
||||
|
||||
TODO
|
||||
|
||||
* Needs more looping for brevity
|
||||
|
||||
#### Arguments
|
||||
|
||||
* **$1** (string): The final shift table (REMcWithShift.csv)
|
||||
@@ -832,7 +862,9 @@ Execute createHeatMapsAll.R
|
||||
|
||||
Perform python dcon portion of GTF
|
||||
|
||||
Output
|
||||
SCRIPT: [DconJG2.py](apps/python/DconJG2.py)
|
||||
|
||||
OUTPUT
|
||||
|
||||
* 1-0-0-finaltable.csv
|
||||
|
||||
@@ -844,9 +876,13 @@ Output
|
||||
### pl_gtf_analyze
|
||||
|
||||
Perl analyze wrapper
|
||||
This seems weird to me because we're just overwriting the same data for all set2 members
|
||||
https://metacpan.org/dist/GO-TermFinder/view/examples/analyze.pl
|
||||
Is there a reason you need a custom version and not the original from cpan?
|
||||
|
||||
SCRIPT: [analyze_v2.pl](https://metacpan.org/dist/GO-TermFinder/view/examples/analyze.pl)
|
||||
|
||||
TODO
|
||||
|
||||
* Are we just overwriting the same data for all set2 members?
|
||||
* Why the custom version?
|
||||
|
||||
#### Arguments
|
||||
|
||||
@@ -858,7 +894,10 @@ Is there a reason you need a custom version and not the original from cpan?
|
||||
### pl_gtf_terms2tsv
|
||||
|
||||
Perl terms2tsv wrapper
|
||||
Probably should be translated to shell/python
|
||||
|
||||
TODO
|
||||
|
||||
* Probably should be translated to shell/python
|
||||
|
||||
#### Arguments
|
||||
|
||||
@@ -868,7 +907,10 @@ Probably should be translated to shell/python
|
||||
|
||||
Python concat wrapper for GTF
|
||||
Concat the process ontology outputs from the /REMcReady_lm_only folder
|
||||
Probably should be translated to bash
|
||||
|
||||
TODO
|
||||
|
||||
* Probably should be translated to bash
|
||||
|
||||
#### Arguments
|
||||
|
||||
@@ -883,24 +925,18 @@ Compile GTF in R
|
||||
|
||||
* **$1** (string): gtf output directory
|
||||
|
||||
### get_studies
|
||||
### study_info
|
||||
|
||||
Parse study names from StudyInfo.csv files
|
||||
Creates, modifies, and parses the study info file
|
||||
|
||||
TODO
|
||||
|
||||
* This whole wrapper should eventually be either
|
||||
* Removed
|
||||
* Expanded into a file that stores all project/study settings (database)
|
||||
* I had to add a new line to the end of StudyInfo.csv, may break things?
|
||||
|
||||
#### Arguments
|
||||
|
||||
* **$1** (string): Study info file
|
||||
* Needs refactoring
|
||||
* Ended up combining a few functions into one
|
||||
|
||||
#### Variables set
|
||||
|
||||
* **STUDIES_NUMS** (array): Contains Exp numbers
|
||||
* **STUDIES_NUMS** (array): contains Exp numbers
|
||||
|
||||
#### Exit codes
|
||||
|
||||
|
||||