Cleanup docs
@@ -1,69 +1,23 @@
#!/usr/bin/env bash
# Copyright 2024 Bryan C. Roessler
-# This program contains a mixture of code/pseudocode and shouldn't be run until this message is removed
#
# Allow indirect functions
# shellcheck disable=SC2317
#
-# @name Hartman Lab Self-Documenting Workflow
-# @brief One script to rule them all (see: xkcd #927)
-# @description A flexible yet opinionated analysis framework for the Hartman Lab
-# There should be at least 4 subdirectories to organize Q-HTCP data and analysis. The parent directory is simply called 'Q-HTCP' and the 4 are subdirectories described below (Fig. 1):
-# * **scans/**
-# * This directory contains raw image data and image analysis results for the entire collection of Q-HTCP experiments.
-# * We recommend each subdirectory within 'ExpJobs" should represent a single Q-HTCP experiment and be named using the following convention (AB yyyy_mmdd_PerturbatationsOfInterest): experimenter initials ('AB '), date ('yyyy_mmdd_'), and brief description ('drugs_medias').
-# * Each subdirectory contains the Raw Image Folders for that experiment (a series of N folders with successive integer labels 1 to N, each folder containing the time series of images for a single cell array). It also contains a user-supplied subfolder, which must be named ''MasterPlateFiles" and must contain two excel files, one named 'DrugMedia_experimentdescription' and the other named 'MasterPlate_experimentdescription'. The bolded part of the file name including the underscore is required. The italicized part is optional description. Generally the 'DrugMedia_' file merits description.
-# * If the standard MasterPlate_Template file is being used, it's not needed to customize then name. On the other hand if the template is modified, it is recommended to rename it and describe accordingly - a useful convention is to use the same name for the MP files as given to the experiment (i.e, the parent ExpJobs subdirectory described above) after the underscores.
-# * The 'MasterPlate_' file contain associated cell array information (culture IDs for all of the cell arrays in the experiment) while the 'DrugMedia_' file contains information about the media that the cell array is printed to.
-# * Together they encapsulate and define the experimental design.
-# * The QHTCPImageFolders and 'MasterPlateFiles' folder are the inputs for image analysis with EASY software.
-# * As further described below, EASY will automatically generate a 'Results' directory (within the ExpJobs/'ExperimentJob' folder) with a name that consists of a system-generated timestamp and an optional short description provided by the user (Fig.2). The 'Results' directory is created and entered, using the "File >> New Experiment" dropdown in EASY. Multiple 'Results' files may be created (and uniquely named) within an 'ExperimentJob' folder.
-# * **apps/easy/**
-# * This directory contains the GUI-enabled MATLAB software to accomplish image analysis and growth curve fitting.
-# * EASY analyzes Q-HTCP image data within an 'ExperimentJob'' folder (described above; each cell array has its own folder containing its entire time series of images).
-# * EASY analysis produces image quantification data and growth curve fitting results for each cell array; these results are subsequently assembled into a single file and labeled, using information contained in the 'MasterPlate_' and 'DrugMedia_' files in the 'MasterPlateFiles' subdirectory.
-# * The final files (named '!!ResultsStd_.txt' or '!!ResultsELr_.txt') are produced in a subdirectory that EASY creates within the 'ExperimentJob' folder, named '/ResultsTimeStampDesc/PrintResults' (Fig. 2).
-# * The /EASY directory is simply where the latest EASY version resides (additional versions in development or legacy versions may also be stored there).
-# * The raw data inputs and result outputs for EASY are kept in the 'ExpJobs' directory.
-# * EASY also outputs a '.mat' file that is stored in the 'matResults' folder and is named with the TimeStamp and user-provided name appended to the 'Results' folder name when 'New Experiment' is executed from the 'File' Dropdown menu in the EASY console.
-# * **apps/ezview/**
-# * This directory contains the GUI-enabled MATLAB software to conveniently and efficiently mine the raw cell array image data for a Q-HTCP experiment.
-# * It takes the Results.m file (created by EASY software) as an input and permits the user to navigate through the raw image data and growth curve results for the experiment.
-# * The /EZview provides a place for storing the the latest EZview version (as well as other EZview versions).
-# * The /EZview provides a GUI for examining the EASY results as provided in the …/matResults/… .mat file.
-# * **Master Plates**
-# * This optional folder is a convenient place to store copies of the 'MasterPlate_' and a 'DrugMedia_' file templates, along with previously used files that may have been modified and could be reused or further modified to enable future analyses.
-# * These two file types are required in the 'MasterPlateFiles' folder, which catalogs experimental information specific to individual Jobs in the ExpJobs folder, as described further below.
-#
-# NOTES:
-# * For the time being I have tried to balance the recognizability of your current workflow with better practices that allow this program to function.
-#
-# TODO:
-# * Scripts should be made modular enough that they can be stored in the same dir
-# * Don't cd in scripts
-# * If you must, do it in a subshell at least!
-# * Pass variables
-# * Pass options
-# * Pass arguments
-# * Variable scoping is horrible right now
-# * I wrote this sequentially and tried to keep track the best I could
-# * Local vars have a higher likelihood of being lower case, global vars are UPPER
+# @name Hartman Lab QHTCP Workflow
+# @brief An opinionated yet flexible QHTCP analysis framework for the Hartman Lab.
+#
+# @description
#
-# @option -p<value> | --project=<value> Include one or more projects in the analysis
-# @option -m<value> | --module=<value> Include one or more modules in the analysis (default: all modules)
-# @option -s<value> | --submodule=<value> <cmd> Pass arguments or commands to a submodule in the current project context
-# @option -n<value> | --nomodule=<value> Exclude one or more modules in the analysis
-# @option --markdown Generate the shdoc markdown file for this program
-# @option -y | --yes | --auto Assume yes answer to all questions (non-interactive mode)
-# @option -d | --debug Turn on extra debugging output
-# @option -h | --help Print help message and exit (overrides other options)
-shopt -s extglob
+# See the [User Input](#user-input) section for getting started.
+#
+# Insert a general description of Q-HTCP and the Q-HTCP process here.

+shopt -s extglob # Turn on extended globbing
DEBUG=1 # Turn debugging ON by default during development

-# @section Help
-# @description Print a helpful message
-# @noargs
+# @description Use `--help` to print the help message.
+# @internal
print_help() {
debug "Running: ${FUNCNAME[0]}"

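The option set above is easiest to read next to a worked invocation. The sketch below is illustrative only: the script path and the project/module names are placeholders inferred from the option descriptions, not taken from the repository.

```bash
# Hypothetical invocations (names are placeholders, not real data)

# Run every module on one project
./qhtcp-workflow --project 20240115_jsmith_exampleDrug

# Run only two modules, non-interactively, with debug output
./qhtcp-workflow -p 20240115_jsmith_exampleDrug -m easy,qhtcp --yes --debug

# Pass a comma-separated argument string straight to one submodule
./qhtcp-workflow -p 20240115_jsmith_exampleDrug -s r_gta "Exp1,ZScores_Interaction.csv"
```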
@@ -79,7 +33,7 @@ print_help() {
OPTIONS:
--project, -p PROJECT
PROJECT should follow the pattern ${PROJECT_PREFIX}_PROJECT_NAME
---module, -i MODULE[,MODULE...]
+--module, -m MODULE[,MODULE...]
See MODULES section below for list of available modules
If no --include is specified, all modules are run
--submodule, -s SUBMODULE "[ARG1],[ARG2]..." (string of comma delimited arguments)
@@ -121,21 +75,89 @@ print_help() {
EOF
}

+# @section Notes
+# @description
+#
+# ### TO-DO
+#
+# * Variable scoping is horrible right now
+# * I wrote this sequentially and tried to keep track the best I could
+# * Local vars have a higher likelihood of being lower case, global vars are UPPER
+# * See MODULE specific TODOs below
+#
+# ### General guidelines for writing external scripts
+#
+# * External scripts must be modular enough to handle input and output from multiple directories
+# * Don't cd in scripts (if you must, do it in a subshell!)
+# * Pass variables
+# * Pass options
+# * Pass arguments
+#
+# ## Project layout
+#
+# **qhtcp-workflow/**
+#
+# **scans/**
+#
+# * This directory contains raw image data and image analysis results for the entire collection of Q-HTCP experiments.
+# * Subdirectories within "scans" should represent a single Q-HTCP study and be named using the following convention: yyyymmdd_username_experimentDescription
+# * Each subdirectory contains the Raw Image Folders for that study.
+# * Each Raw Image Folder contains a series of N folders with successive integer labels 1 to N, each folder containing the time series of images for a single cell array.
+# * It also contains a user-supplied subfolder, which must be named "MasterPlateFiles" and must contain two Excel files, one named 'DrugMedia_experimentDescription' and the other named 'MasterPlate_experimentDescription'.
+# * If the standard MasterPlate_Template file is being used, there is no need to customize the name.
+# * If the template is modified, it is recommended to rename it and describe accordingly - a useful convention is to use the same experimentDescription for the MP files as given to the experiment
+# * The 'MasterPlate_' file contains associated cell array information (culture IDs for all of the cell arrays in the experiment) while the 'DrugMedia_' file contains information about the media that the cell array is printed to.
+# * Together they encapsulate and define the experimental design.
+# * The QHTCPImageFolders and 'MasterPlateFiles' folder are the inputs for image analysis with EASY software.
+# * As further described below, EASY will automatically generate a 'Results' directory (within the ExpJobs/'ExperimentJob' folder) with a name that consists of a system-generated timestamp and an optional short description provided by the user (Fig. 2). The 'Results' directory is created and entered, using the "File >> New Experiment" dropdown in EASY. Multiple 'Results' files may be created (and uniquely named) within an 'ExperimentJob' folder.

-# @section User Input
-# @set PROJECTS array List of projects to work on
-# @set MODULES array List of modules to run
-# @set SUBMODULES array List of submodules and their arguments to run
-# @set EXCLUDE_MODULES array List of modules not to run
+# **apps/easy/**
+#
+# * This directory contains the GUI-enabled MATLAB software to accomplish image analysis and growth curve fitting.
+# * EASY analyzes Q-HTCP image data within an 'ExperimentJob' folder (described above; each cell array has its own folder containing its entire time series of images).
+# * EASY analysis produces image quantification data and growth curve fitting results for each cell array; these results are subsequently assembled into a single file and labeled, using information contained in the 'MasterPlate_' and 'DrugMedia_' files in the 'MasterPlateFiles' subdirectory.
+# * The final files (named '!!ResultsStd_.txt' or '!!ResultsELr_.txt') are produced in a subdirectory that EASY creates within the 'ExpJob#' folder, named '/ResultsTimeStampDesc/PrintResults' (Fig. 2).
+# * The /EASY directory is simply where the latest EASY version resides (additional versions in development or legacy versions may also be stored there).
+# * The raw data inputs and result outputs for EASY are kept in the 'ExpJobs' directory.
+# * EASY also outputs a '.mat' file that is stored in the 'matResults' folder and is named with the TimeStamp and user-provided name appended to the 'Results' folder name when 'New Experiment' is executed from the 'File' Dropdown menu in the EASY console.

+# **apps/ezview/**
+#
+# * This directory contains the GUI-enabled MATLAB software to conveniently and efficiently mine the raw cell array image data for a Q-HTCP experiment.
+# * It takes the Results.m file (created by EASY software) as an input and permits the user to navigate through the raw image data and growth curve results for the experiment.
+# * The /EZview provides a place for storing the latest EZview version (as well as other EZview versions).
+# * The /EZview provides a GUI for examining the EASY results as provided in the …/matResults/… .mat file.
+#
+# **Master Plates**
+#
+# * This optional folder is a convenient place to store copies of the 'MasterPlate_' and 'DrugMedia_' file templates, along with previously used files that may have been modified and could be reused or further modified to enable future analyses.
+# * These two file types are required in the 'MasterPlateFiles' folder, which catalogs experimental information specific to individual Jobs in the ExpJobs folder, as described further below.
+#
+#
+# Some example decorators for markdown:
+#
+#
+# @description
+# `--project`, `--module`, `--nomodule`, and `--submodule` can be passed multiple times or with a comma-separated string
+# @option -p<value> | --project=<value> One or more projects to analyze, can be passed multiple times or with a comma-separated string
+# @option -m<value> | --module=<value> One or more modules to run (default: all), can be passed multiple times or with a comma-separated string
+# @option -s<value> | --submodule=<value> Requires two arguments: the name of the submodule and its arguments, can be passed multiple times
+# @option -n<value> | --nomodule=<value> One or more modules (default: none) to exclude from the analysis
+# @option --markdown Generate the shdoc markdown file for this program
+# @option -y | --yes | --auto Assume yes answer to all questions (non-interactive mode)
+# @option -d | --debug Turn on extra debugging output
+# @option -h | --help Print help message and exit (overrides other options)
+# @set PROJECTS array List of projects to cycle through
+# @set MODULES array List of modules to run on each project
+# @set SUBMODULES array List of submodules and their arguments to run on each project
+# @set EXCLUDE_MODULES array List of modules not to run on each project
# @set DEBUG int Turn debugging on
# @set YES int Turn assume yes on
-# @description Creates array and switches from user input
-# parse_input() takes all of the arguments passed to the script
parse_input() {
debug "Running: ${FUNCNAME[0]} $*"

-long_opts="project:,module:,nomodule:,markdown,yes,auto,debug,help"
+long_opts="project:,module:,submodule:,nomodule:,markdown,yes,auto,debug,help"
-short_opts="+p:m:n:ydh"
+short_opts="+p:m:s:n:ydh"

if input=$(getopt -o $short_opts -l $long_opts -- "$@"); then
eval set -- "$input"
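The hunk ends right after the getopt call, so the option loop itself is not shown. The following is a minimal sketch of the loop that typically follows `eval set -- "$input"`, using the option list and the @set variable names documented above; the real script's handling (especially of --submodule's second argument) may differ.

```bash
# Sketch only: consume the getopt-normalized arguments
while true; do
    case "$1" in
        -p|--project)    IFS=',' read -ra p <<< "$2"; PROJECTS+=("${p[@]}"); shift 2 ;;
        -m|--module)     IFS=',' read -ra m <<< "$2"; MODULES+=("${m[@]}"); shift 2 ;;
        -s|--submodule)  SUBMODULES+=("$2"); shift 2 ;;  # real script also takes an argument string
        -n|--nomodule)   IFS=',' read -ra n <<< "$2"; EXCLUDE_MODULES+=("${n[@]}"); shift 2 ;;
        --markdown)      MARKDOWN=1; shift ;;
        -y|--yes|--auto) YES=1; shift ;;
        -d|--debug)      DEBUG=1; shift ;;
        -h|--help)       print_help; exit 0 ;;
        --)              shift; break ;;
        *)               break ;;
    esac
done
```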
@@ -183,25 +205,28 @@ parse_input() {
fi
}

-# @section Helper functions
-# @arg $1 array A module to initialize (add to ALL_MODULES)
-# @set ALL_MODULES array A list of all available modules
-# @internal
+# @section Modules
+# @description
+#
+# A module contains a cohesive set of actions/experiments to run on a project
+#
+# Use a module to:
+#
+# * Build a new type of analysis from scratch
+# * Generate project directories
+# * Group multiple submodules (and modules) into a larger task
+# * Dictate the ordering of multiple submodules
+# * Competently handle pushd and popd for their submodules if they do not reside in the SCANS/PROJECT_DIR
+# * Call their submodules with the appropriate arguments
+#
+# @description
module() {
debug "Adding $1 module"
ALL_MODULES+=("$1")
declare -gA "$1"
}

-# @arg $1 array A submodule to initialize (add to ALL_SUBMODULES)
-# @set ALL_SUBMODULES array A list of all available submodules
-# @internal
-submodule() {
-debug "Adding $1 submodule"
-ALL_SUBMODULES+=("$1")
-declare -gA "$1"
-}
+# @description Ask the user a yes/no question

# @arg $1 string The question to ask
# @exitcode 0 If yes
# @exitcode 1 If no
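The body of the yes/no helper is outside this hunk, so the following is only a sketch that satisfies the @arg/@exitcode contract documented above; the prompt format and the use of the global `$YES` flag are assumptions.

```bash
# Sketch of a yes/no prompt matching the documented interface
ask_sketch() {
    [[ $YES -eq 1 ]] && return 0          # --yes/--auto: assume yes
    local reply
    read -r -p "$1 [y/N] " reply
    [[ $reply =~ ^[Yy]([Ee][Ss])?$ ]]     # exit 0 on yes, 1 otherwise
}
```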
@@ -297,8 +322,10 @@ random_three_words() {

printf "%s_" "${arr[@]}" | sed 's/_$//'
}

# @description Backup one or more files to an incremented .bk file
# @exitcode backup iterator max 255
+# @internal
backup() {
debug "Running: ${FUNCNAME[0]} $*"
for f in "$@"; do
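The backup loop body is truncated by this hunk. The sketch below illustrates the documented idea (copy each file to an incremented .bk suffix, capped at 255 so the iterator fits in an exit code); it is not the project's implementation.

```bash
# Illustration of "backup to an incremented .bk file"
backup_sketch() {
    local f i=0
    for f in "$@"; do
        [[ -e $f ]] || continue
        i=1
        while [[ -e "$f.bk$i" && $i -lt 255 ]]; do ((i++)); done
        cp -a -- "$f" "$f.bk$i"
    done
    return "$i"   # documented: exit code is the backup iterator, max 255
}
```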
@@ -314,6 +341,7 @@ backup() {
}

# @description Prints a helpful message at program start
+# @internal
interactive_header() {
debug "Running: ${FUNCNAME[0]}"

@@ -456,46 +484,44 @@ interactive_header() {
}


-# @section Modules
-# @description A module contains a cohesive set of actions/experiments to run on a project
-# Use a module when:
-# * Building a new type of analysis from scratch
-# * Generating project directories
-# * Grouping multiple submodules (and modules) into a larger task
-# * Dictating the ordering of multiple submodules
-# * Modules should competently handle pushd and popd for their submodules if they do not reside in the SCANS/PROJECT_DIR
-# * Apps and submodules should avoid changing directories
-# * Pass input data from somewhere and output data somewhere

module install_dependencies
-# @section Install dependencies
-# @description Installs dependencies for the workflow
+# @description This module will automatically install the dependencies for running QHTCP.
+#
+# If you wish to install them manually, you can use the following information to do so:
+#
+# #### System dependencies
+#
+# * R
+# * Perl
+# * Java
+# * MATLAB
+#
+# #### MacOS
+#
+# * `export HOMEBREW_BREW_GIT_REMOTE=https://github.com/Homebrew/brew`
+# * `/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"`
+# * `cpan File::Map ExtUtils::PkgConfig GD GO::TermFinder`
+# * `brew install graphviz gd pdftk-java pandoc shdoc nano rsync coreutils`
+#
+# #### Linux DEB
+#
+# * `apt install graphviz pandoc pdftk-java libgd-dev perl shdoc nano rsync coreutils libcurl-dev openssl-dev`
+#
+# #### Linux RPM
+#
+# * `dnf install graphviz pandoc pdftk-java gd-devel perl-CPAN shdoc nano rsync coreutils libcurl-devel openssl-devel`
+#
+# #### Perl
+#
+# * `cpan File::Map ExtUtils::PkgConfig GD GO::TermFinder`
+#
+# #### R
+#
+# * `install.packages(c('BiocManager', 'ontologyIndex', 'ggrepel', 'tidyverse', 'sos', 'openxlsx', 'ggplot2', 'plyr', 'extrafont', 'gridExtra', 'gplots', 'stringr', 'plotly', 'ggthemes', 'pandoc', 'rmarkdown', 'htmlwidgets'), dep=TRUE)`
+# * `BiocManager::install('UCSC.utils')`
+# * `BiocManager::install('org.Sc.sgd.db')`
#
#
-#
-# Dependencies
-# * R
-# * Perl
-# * Java
-# * MATLAB
-#
-# For MacOS
-# * export HOMEBREW_BREW_GIT_REMOTE=https://github.com/Homebrew/brew
-# * /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
-# * cpan File::Map ExtUtils::PkgConfig GD GO::TermFinder
-# * brew install graphiz gd pdftk-java pandoc shdoc nano rsync
-#
-# For Linux
-# * cpan File::Map ExtUtils::PkgConfig GD GO::TermFinder
-# * apt-get install graphviz libgd-dev pdftk-java pandoc shdoc nano rsync
-# or
-# * dnf install graphviz pandoc pdftk-java gd-devel shdoc nano rsync
-#
-# For R
-# * install.packages(“BiocManager”)
-# * BiocManager::install(“org.Sc.sgd.db”)
-# * install.packages(c('ontologyIndex', 'ggrepel', 'tidyverse', 'sos', 'openxlsx'), dep=TRUE)
-# @noargs
install_dependencies() {
debug "Running: ${FUNCNAME[0]} $*"

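Since install_dependencies() itself is elided in this hunk, here is a hedged sketch of what an automated installer can look like, assembled only from the package lists documented above; the real function may differ (privilege handling, package availability, and error checking are simplified).

```bash
# Sketch: pick a package manager and install the documented dependencies
install_deps_sketch() {
    if command -v brew >/dev/null; then
        brew install graphviz gd pdftk-java pandoc shdoc nano rsync coreutils
    elif command -v apt >/dev/null; then
        sudo apt install -y graphviz pandoc pdftk-java libgd-dev perl shdoc nano rsync coreutils
    elif command -v dnf >/dev/null; then
        sudo dnf install -y graphviz pandoc pdftk-java gd-devel perl-CPAN shdoc nano rsync coreutils
    fi
    cpan File::Map ExtUtils::PkgConfig GD GO::TermFinder
    Rscript -e "install.packages(c('BiocManager','ontologyIndex','ggrepel','tidyverse','sos','openxlsx'), dep=TRUE)"
    Rscript -e "BiocManager::install('org.Sc.sgd.db')"
}
```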
@@ -573,24 +599,27 @@ install_dependencies() {


module init_project
-# @section Initialize a new project in the scans directory
# @description This function creates and initializes project directories
-# This module is responsible for the following tasks:
-# * Initializing a project directory in the scans directory
-# * Initializing a QHTCP project directory in the qhtcp directory
+#
+# This module:
+#
+# * Initializes a project directory in the scans directory
#
# TODO
+#
# * Copy over source image directories from robot - are these also named by the ExpJobs name?
# * Suggest renaming ExpJobs to something like "scans" or "images"
# * MasterPlate_ file **should not** be an xlsx file, no portability
#
# NOTES
+#
# * Copy over the images from the robot and then DO NOT TOUCH that directory except to copy from it
# * Write-protect (read-only) if we need to
# * Copy data from scans/images directory to the project working dir and then begin analysis
# * You may think...but doesn't that 2x data?
# * No, btrfs subvolume uses reflinks, only data that is altered will be duplicated
# * Most of the data are static images that are not written to, so the data is deduplicated
+#
init_project() {
debug "Running: ${FUNCNAME[0]}"

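The reflink note above is worth a concrete illustration. The paths below are placeholders; the point is that on btrfs (or XFS) a reflink copy or a subvolume snapshot stages the raw scans into a project working directory without physically duplicating unchanged image data, and `--reflink=auto` degrades to a normal copy elsewhere.

```bash
# Stage raw scans into a project working directory without 2x storage
mkdir -p projects
cp -a --reflink=auto scans/20240115_jsmith_exampleDrug projects/20240115_jsmith_exampleDrug

# Or, if the scans directory is a btrfs subvolume, snapshot it instead:
# btrfs subvolume snapshot scans/20240115_jsmith_exampleDrug projects/20240115_jsmith_exampleDrug
```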
@@ -641,47 +670,236 @@ init_project() {


module easy
-# @section EASY
-# @description Start an EASY analysis
+# @description
+# Run the EASY matlab program
+#
-# TODO Don't create output in the scans folder, put it in an output directory
-# TODO The !!Results output files need standardized naming
-# TODO Don't perform directory operations in EASY
+# TODO
+#
+# * Don't create output in the scans folder, put it in an output directory
+# * The !!Results output files need standardized naming
+# * The input MasterPlate and DrugMedia sheets need to be converted to something standard like csv/tsv
+#
+# NOTES
+#
+# * I've modularized EASY to fit into this workflow but there may be things broken (especially in "stand-alone" mode)
# * The scans/images and 'MasterPlateFiles' folder are the inputs for image analysis with EASY software.
# * EASY will automatically generate a 'Results' directory (within the ExpJobs/'ExperimentJob' folder) w/ timestamp and an optional short description provided by the user (Fig. 2).
# * The 'Results' directory is created and entered, using the "File >> New Experiment" dropdown in EASY.
# * Multiple 'Results' files may be created (and uniquely named) within an 'ExperimentJob' folder.
#
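The note that "this program should handle the relevant directory and file creation and load the correct project into EASY" is not backed by code in this hunk, so the following is only one speculative way a module could hand a project off to MATLAB. The EASYconsole entry point comes from templates/easy in the old listing; the flags, directory layout, and working-directory handling are assumptions.

```bash
# Speculative sketch: launch the EASY console for the current project
(
    cd apps/easy || exit 1
    matlab -nosplash -nodesktop -r "run('EASYconsole.m')"
)
```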
-# Template:
-# templates/easy
-# * [datatipp.m](templates/easy/datatipp.m)
-# * [DgenResults.m](templates/easy/DgenResults.m)
-# * [DMPexcel2mat.m](templates/easy/DMPexcel2mat.m)
-# * [EASYconsole.asv](templates/easy/EASYconsole.asv)
-# * [EASYconsole.fig](templates/easy/EASYconsole.fig)
-# * [EASYconsole.m](templates/easy/EASYconsole.m)
-# * [figs](templates/easy/figs)
-# * [NPTdirect.fig](templates/easy/figs/NPTdirect.fig)
-# * [searchNPTIm.fig](templates/easy/figs/searchNPTIm.fig)
-# * [NCdisplayGui.m](templates/easy/NCdisplayGui.m)
-# * [NCfitImCFparforFailGbl2.m](templates/easy/NCfitImCFparforFailGbl2.m)
-# * [NCscurImCF_3parfor.m](templates/easy/NCscurImCF_3parfor.m)
-# * [NCsingleDisplay.m](templates/easy/NCsingleDisplay.m)
-# * [NIcircle.m](templates/easy/NIcircle.m)
-# * [NImParamRadiusGui.m](templates/easy/NImParamRadiusGui.m)
-# * [NIscanIntensBGpar4GblFnc.m](templates/easy/NIscanIntensBGpar4GblFnc.m)
-# * [p4loop8c.m](templates/easy/p4loop8c.m)
-# * [par4GblFnc8c.m](templates/easy/par4GblFnc8c.m)
-# * [par4Gbl_Main8c.m](templates/easy/par4Gbl_Main8c.m)
-# * [PTmats](templates/easy/PTmats)
-# * [Nbdg.mat](templates/easy/PTmats/Nbdg.mat)
-# * [NCFparms.mat](templates/easy/PTmats/NCFparms.mat)
-# * [NImParameters.mat](templates/easy/PTmats/NImParameters.mat)
-# * [NPTdirectParameters.mat](templates/easy/PTmats/NPTdirectParameters.mat)
-# * [NPTmapDirect.mat](templates/easy/PTmats/NPTmapDirect.mat)
-# * [NPTmapSearch.mat](templates/easy/PTmats/NPTmapSearch.mat)
-# * [NPTsearchParameters.mat](templates/easy/PTmats/NPTsearchParameters.mat)
+# INSTRUCTIONS
+#
+# * This program should handle the relevant directory and file creation and load the correct project into EASY
+#
+# #### Pin-tool mapping
+#
+# * Select at least two images from your experiment (or another experiment) to place in a 'PTmapFiles' folder.
+# * Sometimes an experiment doesn't have a complete set of quality spots for producing a pin tool map that will be used to start the spot search process.
+# * In this case the folder for Master Plate 1 (MP 1) is almost good but has a slight problem.
+# * At P13 the spot is extended. We might use this one but would like to include others that are more centered if possible.
+# * The other plates with higher drug concentrations could be used, but since a recent experiment has a better reference plate image, we will add it to the set of images to produce the pin tool map.
+#
+#
+# * We will find a good image from another experiment
+#
+#
+# * We now have some images to generate a composite centered map for EASY to search and find the nucleation area of each spot as it forms.
+# * Click the Run menu tab.
+# * A drop down list of options is presented.
+# * Click the first item → [Plate Map Pintool ].
+#
+#
+# * Open the PTmapFiles folder.
+# * Then click on the .bmp files you wish to include to make the pin tool map.
+# * Click the Open button.
+#
+#
+# * A warning dialog box may appear.
+# * This is nothing to be concerned about.
+# * Click OK and continue.
+#
+#
+# * 'Retry' takes you back so that you can select different .bmp files from which to create the map.
+# * In this case the spots from the two images are well aligned and give coverage to all the spots, therefore we do not have to add new images.
+# * Remember, this map is just a good guess as to where to start looking for each spot, not where it will focus to capture intensities.
+# * Click 'Open' again.
+#
+#
+# * We can now shift these values to get a better 'hard' start for this search.
+# * Maybe we can move this search point to the left a bit by decreasing the 'Initial Col Position' slightly to 120 and clicking continue.
+#
+#
+# * Even though the first result image using 126 may have given a good map, we will use the improved second initiation point by clicking the 'Continue/Finish' button.
+#
+#
+# * Note that the red “hot” spot is now well centered in each composite image spot.
+# * We can now click 'Continue / Finish' to proceed.
+# * The coordinates and parameters will be stored in the results folder 'PTmats'.
+# * This is where the .mat files that contain localization data for the next section of our work are stored.
+# * The EASY GUI will come back.
+# * Now click the 'Run' → 'Image Curve ComboAnalysis'.
+# * This will perform the image quantification and then generate the curve fits for each plate selected.
+# * Typically we pick only one plate at a time for one or two master plates.
+# * The software will present the final image of the search if only 1 master plate is selected.
+# * If multiple plates are selected, no search images will be presented or stored as figures.
+# * However, all the position data for every spot at every time point will be stored.
+# * This large data trove is used by EZview to produce the image click-on hot spot maps and the photo strips.
+#
+#
+# * Note the 'Select Files' dialog.
+# * It allows the user to select the specific .bmp files to use.
+# * This can be useful if there are bad ones that need to be removed from the folder due to contamination.
+# * If all are good we can select them all and click 'Open' to run the process.
+# * There are other parameters that can be selected.
+# * For now we will continue and come back to those later.
+#
+#
+#
+# * The search focus of each spot at the end of the run is presented for examination
+# * Notice that these have floated and locked in to a position, determined on the fly, at the point where the initial growth has reached maturity.
+# * This prevents a late-onset jump to a contamination site.
+# * If we find that we need to adjust our pin tool map or make other modifications, we can do that and rerun these single runs until we are satisfied.
+#
+#
+# * Next we will run the entire experiment by clicking on all the other master plates from the list box.
+#
+#
+# * Depending on the number of master plates and the number of time point images taken for each, this next step can take a while.
+# * Click continue and do something else while the computer works.
+# * When the work is complete the EASY GUI will reappear without the master plate list.
+# * Now look in the /Results*/PrintResults folder to check that all the plates have run and produced data.
+#
+#
+# * This is a legacy print copy of data, but is still useful to check that all the quantification was completed successfully.
+#
+#
+# #### Generate Reports
+#
+# * Generate a MPDM.mat file from the Excel master plate and drug media sheets that the user prepared as part of the experiment preparation.
+# * These sheets must conform to certain format rules.
+# * It is best when creating these to use a working copy as a template and replace the data with that of the current experiment.
+# * See the Master Plate and Drug-Media Plate topic for details.
+# * Click on the 'GenReports' menu tab and a drop-down menu is presented; click the first item 'DrugMediaMP Generate .mat'.
+# * This will take you to the /MasterPlateFiles folder within the experiment currently being analyzed.
+# * Do as the dialog box instructs. Select the Master Plate Excel file first.
+# * Important note: These files (both for master plates and drug-medias) must be generated or converted to the Excel 95 version to be read in Linux.
+# * This can be done on either a Windows or an Apple machine running Excel.
+#
+#
+#
+#
+#
+#
+# * A message dialog pops up.
+# * Click 'OK'.
+#
+#
+# * Next click on the 'GenReports' menu tab and the second item in the drop down list 'ResultsDB Generate'.
+#
+#
+# * A dialog box with options appears.
+# * The default is 'Both'.
+# * 'Res' produces only a standard result sheet in the current experiment's /Results*/PrintResults folder.
+# * 'DB' produces only a special file for convenient upload to databases.
+# * This file has no blank rows separating the plates and combines the raw data for each line item into a 'blob' as this is a convenient way to store data of variant lengths in a single database field.
+# * The concatenation of data for each row takes a while, but is useful for uploading data.
+# * Typically 'Both' is the preferred option, however, if one needs to review the results quickly, this provides that option.
+#
+# * We can open the !!Results MI 16_0919 yor1-1 copy.txt text file using LibreOffice to review the results.
+#
+#
+# * We can do likewise with the !!Dbase_MI 16_0919_yor1-2 copy.txt text file.
+#
+#
+# * Note that there are no headers or empty rows.
+# * Since LibreOffice may corrupt the text files, it could be advisable to only read them and refrain from any 'Save' options presented.
+#
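The Excel 95 requirement above can usually be met without opening Excel at all. This is a sketch using LibreOffice's headless converter; the filter name and the file names are assumptions to verify on your system before relying on it.

```bash
# Convert the user-prepared sheets to the legacy Excel 95 .xls format
libreoffice --headless --convert-to xls:"MS Excel 95" \
    MasterPlate_exampleDrug.xlsx DrugMedia_exampleDrug.xlsx
```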
+# #### Master Plate and Drug Media Spreadsheets
+#
+# * The Master Plate and Drug-Media spreadsheets correlate the collected and calculated data with the definitions of the cultures, drugs and media involved in producing the experimental data.
+# * These spreadsheets have a very specific format which was established at the beginning of our work.
+# * To maintain compatibility over the years, we maintain that format.
+# * To begin with, our system can be used with Windows, Linux and Apple operating systems.
+# * To accommodate these OS's, the Excel version must be an older Excel 95 version which is cross-compatible for MATLAB versions within all three major OS's.
+# * Windows is more tolerant, but to avoid problems producing results reports, ALWAYS use the Excel 95 format for your spreadsheets.
+# * Do not remove any header rows. They can be modified with the exception of the triple hash (###).
+# * Do not change the number or order of the columns.
+# * Next, place a 'space' in unused empty spreadsheet entry positions.
+# * Empty cells can cause problems in general for some software utilities.
+# * It is just best to make this a standard practice.
+# * Avoid using special characters.
+# * Depending on the OS and software utility (especially database utilities), these can be problematic.
+# * Certain 'date' associated entries such as Oct1 or OCT1 will be interpreted by Excel as a date and automatically formatted as such.
+# * Do not use Oct1 (which is a yeast gene name); use Oct1_ or its ORF name instead.
+# * When creating a Master Plate spreadsheet, it is best to start with a working spreadsheet template and adjust it to your descriptive data.
+# * Be sure that the ### mark is always present in the first column of the header for each plate.
+# * This is an important convention as it is used to define a new plate set of entry data.
+# * Each plate is expected to have 384 rows of data correlated with the 384 wells of the source master plates.
+# * These have a particular order going through all 24 columns of each row before proceeding to the next row.
+# * Gene names and ORF name entries should be as short as possible (4-5 characters max if possible) as these are used repeatedly as part of concatenated descriptors.
+# * The 'Replicate' field and the 'Specifics' fields can be used for additional information.
+# * The 'Replicate' field was originally designed to allow the user to sort replicates but it can be used for other relevant information.
+# * The 'Specifics' field was created to handle special cases where the liquid media in which cultures were grown on a single source plate was selectively varied.
+# * This gives the researcher a way to sort by growth media as well as gene or ORF name.
+# * It can also be used to sort other properties instead of modifying the gene name field.
+# * Thoughtful experiment design and layout are important for the successful analysis of the resultant data.
+# * It is typically a good idea to create at least one reference full plate and make that plate the first source master plate.
+# * Typically we give those reference cultures the 'Gene Name' RF1.
+# * Traditionally we also made a second full reference plate with its cultures labeled RF2.
+# * More recently some researchers have gone to dispersing RF1 control reference cultures throughout the source master plate series in addition to the first full source master plate.
+# * The EZview software has been updated accordingly to find these references and perform associated calculations.
+#
+#
+#
+# * There are a number of fields on the spreadsheet which in this case were left empty.
+# * This spreadsheet format was created originally with studies of whole yeast genome SGA modifications incorporated.
+# * Therefore, not all fields may be relevant.
+# * However, whenever relevant it is strongly advised to fill in all the appropriate data.
+# * The Drug-Media spreadsheet defines the perturbation components of each type of agar plate that the source master plates are printed on.
+# * Again, format adherence is essential.
+# * There is a '1' in the first column, second row (A2).
+# * This is a legacy going back to early use.
+# * It is still necessary and should not be deleted.
+# * The header row must not be deleted.
+# * A triple hash (###) must be placed in the cell below the last entry in the Drug field (Column 2).
+# * Again, insert a 'space' in each unused or empty cell in each field.
+# * Again, avoid special characters, which may cause problems if not in the experiment quantification then in subsequent analysis utilities.
+# * A utility looking for a text field may end up reading a null and responding inappropriately.
+# * As with the master plate Excel sheet, it is a good idea to use a working copy of an existing Drug-Media spreadsheet and adapt it to one's needs.
+#
+#
+#
+#
+#
+#
+#
#
# To analyze a new Q-HTCP experiment:
+#
# * Open the EASY Software.
# * Open 'EstartConsole.m' with MATLAB
# * Click the Run icon (play button)
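A quick mechanical check of the layout rules above can save a failed report run. The sketch assumes the MasterPlate sheet has been exported to CSV (the file name and the assumption that '###' sits alone in the first column are hypothetical); it simply verifies that every '###' plate header is followed by 384 culture rows.

```bash
# Sanity-check plate blocks in a CSV export of a MasterPlate_ sheet
awk -F',' '
    $1 == "###" { if (plate && rows != 384) printf "plate %d has %d rows\n", plate, rows
                  plate++; rows = 0; next }
    plate { rows++ }
    END { if (plate && rows != 384) printf "plate %d has %d rows\n", plate, rows }
' MasterPlate_exampleDrug.csv
```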
@@ -715,6 +933,9 @@ module easy
# * When finished, the '!!ResultsStd_.txt' will be about the same file size and it should be used in the following StudiesQHTCP analysis.
# * 'NoGrowth_.txt', and 'GrowthOnly_.txt' files will be generated in the 'PrintResults' folder.
#
+#
+#
+#
# Issues:
# * We need full documentation for all of the current workflow. There are different documents that need to be integrated. This will need to be updated as we make improvements to the system.
# * MasterPlate_ file must have ydl227c in orf column, or else Z_interaction.R will fail, because it can't calculate shift values.
@@ -826,7 +1047,6 @@ easy() {


module ezview
-# @section EZview
# @description TODO WIP
ezview() {
debug "Running: ${FUNCNAME[0]}"
@@ -844,18 +1064,21 @@ ezview() {


module qhtcp
-# @section QHTCP
# @description System for Multi-QHTCP-Experiment Gene Interaction Profiling Analysis
+#
# * Functional rewrite of REMcMaster3.sh, RemcMaster2.sh, REMcJar2.sh, ExpFrontend.m, mProcess.sh, mFunction.sh, mComponent.sh
# * Added a newline character to the end of StudyInfo.csv so it is a valid text file
-# TODO Suggest renaming StudiesQHTCP to something like qhtcp qhtcp_output or output
-# TODO Store StudyInfo somewhere better
-# TODO Move (hide) the study template somewhere else
-# TODO StudiesArchive should be smarter:
+#
+# TODO
+#
+# * Suggest renaming StudiesQHTCP to something like qhtcp, qhtcp_output, or output
+# * Store StudyInfo somewhere better
+# * Move (hide) the study template somewhere else
+# * StudiesArchive should be smarter:
# * Create a database with as much information as possible
# * Write a function that easily loads and parses database into easy-to-use variables
# * Allow users to reference those variables to write their own modules
-# TODO Should not be using initials
+# * Should not be using initials
# * not unique enough and we don't have that data easily on hand
# * usernames are unique and make more sense
# * I don't know what all would have to be modified atm
@@ -883,7 +1106,7 @@ module qhtcp
# * When prompted, navigate to the ExpJobs folder and the PrintResults folder within the correct job folder.
# * Repeat this for every Exp# folder depending on how many experiments are being performed.
# * Note: Before doing this, it's a good idea to compare the ref and non-ref CPP average and median values. If they are not approximately equal, then it may be helpful to standardize Ref values to the measures of central tendency of the Non-refs, because the Ref CPPs are used for the z-scores, which should be centered around zero.
-# * This script will copy the !!ResultsStd file (located in /PrintResults in the relevant job folder in /ExpJobs **rename this !!Results file before running front end; we normally use the 'STD' (not the 'ELR' file) chosen to the Exp# directory as can be seen in the “Current Folder” column in MATLAB, and it updates 'StudiesDataArchive.txt' file that resides in the /StudiesQHTCP folder. 'StudiesDataArchive.txt' is a log of file paths used for different studies, including timestamps.
+# * This script will copy the !!ResultsStd file (located in /PrintResults in the relevant job folder in /scans **rename this !!Results file before running front end; we normally use the 'STD' (not the 'ELR' file) chosen to the Exp# directory as can be seen in the “Current Folder” column in MATLAB, and it updates the 'StudiesDataArchive.txt' file that resides in the /StudiesQHTCP folder. 'StudiesDataArchive.txt' is a log of file paths used for different studies, including timestamps.
#
# Do this to document the names, dates and paths of all the studies and experiment data used in each study. Note, one should only have a single '!!Results…' file for each /Exp_ to prevent ambiguity and confusion. If you decide to use a new or different '!!Results…' sheet from what was used in a previous “QHTCP Study”, remove the one not being used. NOTE: if you copy a '!!Results…' file in by hand, it will not be recorded in the 'StudiesDataArchive.txt' file and so will not be documented for future reference. If you use the ExpFrontend.m utility it will append the new source for the raw !!Results… to the 'StudiesDataArchive.txt' file.
# As stated above, it is advantageous to think about the comparisons one wishes to make so as to order the experiments in a rational way as it relates to the presentation of plots. That is, which results from sheets and selected 'interaction … .R', user modified script, is used in /Exp1, Exp2, Exp3 and Exp4 as explained in the following section.
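If a '!!Results…' file ever has to be copied in by hand (which, as noted above, ExpFrontend.m would otherwise record), the provenance can still be logged. This is only a sketch: the paths are placeholders and the real line format of StudiesDataArchive.txt is not documented here, so the tab-separated layout is an assumption.

```bash
# Sketch: record a manual !!Results copy in StudiesDataArchive.txt
src="scans/20240115_jsmith_exampleDrug/Results_20240116/PrintResults/!!ResultsStd_.txt"
dest="StudiesQHTCP/ExampleStudy/Exp1/"
cp -a -- "$src" "$dest"
printf '%s\t%s\t%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$src" "$dest" \
    >> StudiesQHTCP/StudiesDataArchive.txt
```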
@@ -893,7 +1116,9 @@ module qhtcp
# As stated earlier, the user can add folders to back up temporary results, study-related notes, or other related work.
# However, it is advised to set up and use separate STUDIES when evaluating differing data sets whether that is from experiment results files or from differing data selections in the first interaction … .R script stage.
# This reduces confusion at the time of the study and especially for those reviewing study analysis in the future.
-# How-To Procedure: Execute a Multi-experiment Study
+#
+# How-To Procedure: Execute a Multi-experiment Study:
+#
# * Consider the goals of the study and design a strategy of experiments to include in the study.
# * Consider the quality of the experiment runs using EZview to see if there are systematic problems that are readily detectable.
# * In some cases, one may wish to design a 'pilot' study for discovery purposes.
@@ -1032,9 +1257,10 @@ qhtcp() {


module remc
-# @section remc
# @description remc module for QHTCP
+#
# TODO
+#
# * Which components can be parallelized?
# @arg $1 string studyInfo file
remc() {
@@ -1067,7 +1293,6 @@ remc() {


module gtf
-# @section GTF
# @description GTF module for QHTCP
# @arg $1 string output directory
# @arg $2 string gene_association.sgd
@@ -1119,11 +1344,13 @@ gtf() {


module gta
-# @section GTA
# @description GTA module for QHTCP
+#
# TODO
+#
# *
# *
+#
# @arg $1 string output directory
# @arg $2 string gene_association.sgd
# @arg $3 string gene_ontology_edit.obo
@@ -1205,35 +1432,45 @@ gta() {
 
 
 # @section Submodules
-# @description Submodules are shell wrappers for workflow components in external languages.
+# @description
+#
+# Submodules are shell wrappers for workflow components in external languages
+#
 # Submodules:
-# * Allow scripts to be called by the main workflow script using input\
-# and output arguments as a translation mechanism.
+#
+# * Allow scripts to be called by the main workflow script using input and output arguments as a translation mechanism.
 # * Only run by default if called by a module.
 # * Can be called directly with its arguments as a comma-separated string
+#
+# @description
+submodule() {
+debug "Adding $1 submodule"
+ALL_SUBMODULES+=("$1")
+declare -gA "$1"
+}
 
 
 submodule r_gta
 # @description GTAtemplate R script
-# TODO:
+#
+# TODO
+#
 # * Is GTAtemplate.R actually a template?
 # * Do we need to allow user customization?
 #
-# Files:
-# * gene_association.sgd: https://downloads.yeastgenome.org/curation/chromosomal_feature/gene_association.sgd
+# Files
+#
+# * [gene_association.sgd](https://downloads.yeastgenome.org/curation/chromosomal_feature/gene_association.sgd)
 # * go_terms.tab
 #
-# Output:
-# *
+# Output
 #
-# This submodule:
-# *
 # *
 #
 # @arg $1 string Exp# name
 # @arg $2 string ZScores_Interaction.csv file
 # @arg $3 string go_terms.tab file
-# @arg $4 string gene_association.sgd
+# @arg $4 string [gene_association.sgd](https://downloads.yeastgenome.org/curation/chromosomal_feature/gene_association.sgd)
 # @arg $5 string output directory
 #
 r_gta() {
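The submodule docs above say a submodule can be called directly with its arguments as a comma-separated string. Below is a minimal sketch of how such a string might be split back into the five positional arguments r_gta documents; the real dispatcher's parsing is not shown in this diff, so the splitting logic here is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: one way a dispatcher could turn a comma-separated argument
# string into the five positional arguments r_gta documents above.
# The real workflow script's parsing is not shown in this diff.
set -euo pipefail

run_submodule() {
  local name=$1 argstring=$2
  local -a args
  IFS=',' read -r -a args <<< "$argstring"   # split "a,b,c" into an array
  "$name" "${args[@]}"                        # call the submodule function
}

# Stand-in for the real r_gta submodule, just to show the argument mapping
r_gta() {
  printf 'Exp name:          %s\n' "$1"
  printf 'ZScores file:      %s\n' "$2"
  printf 'go_terms.tab:      %s\n' "$3"
  printf 'gene_association:  %s\n' "$4"
  printf 'output directory:  %s\n' "$5"
}

run_submodule r_gta "Exp1,ZScores_Interaction.csv,go_terms.tab,gene_association.sgd,out/gta"
```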
@@ -1256,17 +1493,22 @@ r_gta() {
 
 submodule r_gta_pairwiselk
 # @description PairwiseLK.R R script
-# TODO:
+#
+# TODO
+#
 # * Should move directory creation from PairwiseLK.R to gta module
 #
-# Files:
+# Files
+#
 # *
 # *
 #
-# Output:
+# Output
+#
 # *
 #
 # This submodule:
+#
 # * Will perform both L and K comparisons for the specified experiment folders.
 # * The code uses the naming convention of PairwiseCompare_Exp’#’-Exp’#’ to standardize and keep simple the structural naming (where ‘X’ is either K or L and ‘Y’ is the number of the experiment GTA results to be found in ../GTAresult/Exp_).
 # * {FYI There are also individual scripts that just do the ‘L’ or ‘K’ pairwise studies in the ../Code folder.}
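Given the PairwiseCompare_Exp#-Exp# naming convention described above, the expected comparison set for a study can be enumerated with a short loop. This only illustrates the convention; it is not the logic PairwiseLK.R itself uses.

```bash
#!/usr/bin/env bash
# Sketch only: enumerate the PairwiseCompare_Exp#-Exp# names the docs above
# describe, for both the L and K comparisons, given a list of experiment
# numbers. PairwiseLK.R's own directory handling is not shown in this diff.
set -euo pipefail

exps=(1 2 3)   # experiment numbers with GTA results under ../GTAresult/Exp_

for ((i = 0; i < ${#exps[@]}; i++)); do
  for ((j = i + 1; j < ${#exps[@]}; j++)); do
    for metric in L K; do
      echo "PairwiseCompare_Exp${exps[i]}-Exp${exps[j]} (${metric})"
    done
  done
done
```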
@@ -1293,19 +1535,24 @@ r_gta_pairwiselk() {
 
 submodule r_gta_heatmaps
 # @description TSHeatmaps5dev2.R R script
-# TODO:
+#
+# TODO
+#
 # * Script could use rename
 # * Script should be refactored to automatically allow more studies
 # * Script should be refactored with more looping to reduce verbosity
 #
-# Files:
+# Files
+#
 # *
 # *
 #
-# Output:
+# Output
+#
 # *
 #
 # This submodule:
+#
 # * The Term Specific Heatmaps are produced directly from the ../ExpStudy/Exp_/ZScores/ZScores_Interaction.csv file generated by the user modified interaction… .R script.
 # * The heatmap labeling is per the names the user wrote into the StudyInfo.txt spreadsheet.
 # * Verify that the All_SGD_GOTerms_for_QHTCPtk.csv found in ../Code is what you wish to use or if you wish to use a custom modified version.
@@ -1332,51 +1579,17 @@ r_gta_heatmaps() {
 }
 
 
-
-# submodule mat_exp_frontend
-# # @description Run the ExpFrontend.m program
-# # This submodule:
-# # * Pushes into the Study template directory (ExpTemplate)
-# # * Prompts the user to run ExpFrontend.m
-# # * Pops out
-# # NOTES:
-# # * ExpFrontend.m should be or is being rewritten
-# mat_exp_frontend() {
-# debug "Running: ${FUNCNAME[0]}"
-# cat <<-EOF
-# ExpFrontend.m was made for recording into a spreadsheet
-# ('StudiesDataArchive.txt') the date and files used (i.e., directory paths to the
-# !!Results files used as input for Z-interaction script) for each multi-experiment study.
-
-# Run the front end MATLAB programs in the correct order (e.g., run front end in "exp1"
-# folder to call the !!Results file for the experiment you named as exp1 in the StudyInfo.csv file)
-# The GTA and pairwise, TSHeatmaps, JoinInteractions and GTF Heatmap scripts use this table
-# to label results and heatmaps in a meaningful way for the user and others.
-# The BackgroundSD and ZscoreJoinSD fields will be filled automatically according to user
-# specifications, at a later step in the QHTCP study process.
-
-# COpen MATLAB and in the application navigate to each specific /Exp folder,
-# call and execute ExpFrontend.m by clicking the play icon.
-# Use the "Open file" function from within Matlab.
-# Do not double-click on the file from the directory.
-# When prompted, navigate to the ExpJobs folder and the PrintResults folder within the correct job folder.
-# Repeat this for every Exp# folder depending on how many experiments are being performed.
-# The Exp# folder must correspond to the StudyInfo.csv created above.
-# EOF
-
-# script="ExpFrontend.m"
-# if ! ((YES)) &&
-# ask "Start MATLAB to run $script? This requires a GUI."; then
-# $MATLAB -nosplash -r "$script"
-# fi
-# }
-
-
 submodule r_interactions
 # @description Run the R interactions analysis (Z_InteractionTemplate.R)
-# TODO
-# * don't want to rename Z_InteractionTemplate.R because that will break logic, just edit in place instead
+#
+# TODO
+#
+# * Don't want to rename Z_InteractionTemplate.R because that will break logic, just edit in place instead
+#
 # NOTES
+#
+# *
+#
 # @arg $1 string The current working directory
 r_interactions() {
 debug "Running: ${FUNCNAME[0]}"
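The TODO above prefers editing Z_InteractionTemplate.R in place rather than renaming it. Below is a hedged sketch of that pattern, assuming the template already sits in the working directory passed as $1; the __OUT_DIR__ and __SD__ placeholders are hypothetical, since the template's real editable fields are not shown in this diff.

```bash
#!/usr/bin/env bash
# Sketch only: edit Z_InteractionTemplate.R in place (keeping its name, per the
# TODO above) inside the working directory passed as $1, then run it.
# The __OUT_DIR__ and __SD__ placeholders are hypothetical; the template's real
# editable fields are not shown in this diff.
set -euo pipefail

workdir=${1:?usage: $0 <working directory containing Z_InteractionTemplate.R>}
sd=${2:-2}

sed -i \
  -e "s|__OUT_DIR__|$workdir|g" \
  -e "s|__SD__|$sd|g" \
  "$workdir/Z_InteractionTemplate.R"

( cd "$workdir" && Rscript Z_InteractionTemplate.R )
```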
@@ -1402,10 +1615,13 @@ r_interactions() {
 
 submodule r_join_interactions
 # @description JoinInteractExps3dev.R creates REMcRdy_lm_only.csv and Shift_only.csv
-# Output files:
-# * REMcRdy_lm_only.csv
-# * Shift_only.csv
-# * parameters.csv
+#
+# Output
+#
+# * REMcRdy_lm_only.csv
+# * Shift_only.csv
+# * parameters.csv
+#
 # @arg $1 string The output directory
 # @arg $2 string The sd value
 # @arg $3 string The studyInfo file
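Because java_extract below consumes REMcRdy_lm_only.csv, a quick existence check on the three documented outputs can catch a failed join early. The flat layout directly under the output directory is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: verify the outputs r_join_interactions documents above before
# handing REMcRdy_lm_only.csv to the REMc java step. The flat layout under
# "$1" is an assumption about where the wrapper writes its files.
set -euo pipefail

outdir=${1:?usage: $0 <r_join_interactions output directory>}

for f in REMcRdy_lm_only.csv Shift_only.csv parameters.csv; do
  if [[ ! -s "$outdir/$f" ]]; then
    echo "Missing or empty expected output: $outdir/$f" >&2
    exit 1
  fi
done
echo "Join-interactions outputs look complete in $outdir"
```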
@@ -1427,12 +1643,19 @@ r_join_interactions() {
 
 submodule java_extract
 # @description Jingyu's REMc java utility
-# Input file:
-# * REMcRdy_lm_only.csv
-# Output file:
-# * REMcRdy_lm_only.csv-finalTable.csv
-# NOTE:
-# * Closed-source w/ hardcoded output directory, so have to pushd/popd to run (not ideal)
+#
+# Input
+#
+# * REMcRdy_lm_only.csv
+#
+# Output
+#
+# * REMcRdy_lm_only.csv-finalTable.csv
+#
+# NOTE
+#
+# * Closed-source w/ hardcoded output directory, so have to pushd/popd to run (not ideal)
+#
 # @arg $1 string The output directory
 java_extract() {
 debug "Running: ${FUNCNAME[0]}"
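A sketch of the pushd/popd workaround the NOTE describes for a jar with a hardcoded output directory; the jar path and its command line are placeholders, not the actual invocation.

```bash
#!/usr/bin/env bash
# Sketch only: the pushd/popd pattern the NOTE above describes for running a
# closed-source jar that writes to a hardcoded output directory.
# The jar filename and its arguments are placeholders, not the real invocation.
set -euo pipefail

outdir=${1:?usage: $0 <output directory>}
input="REMcRdy_lm_only.csv"

mkdir -p "$outdir"
cp "$input" "$outdir/"

pushd "$outdir" > /dev/null           # run from the directory the jar insists on
java -jar /path/to/REMc.jar "$input"  # placeholder jar path and CLI
popd > /dev/null

ls -l "$outdir/${input}-finalTable.csv"   # documented output name
```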
@@ -1530,7 +1753,9 @@ r_heat_maps_homology() {
 
 submodule py_gtf_dcon
 # @description Perform python dcon portion of GTF
-# Output file:
+#
+# Output
+#
 # * 1-0-0-finaltable.csv
 # @arg $1 string Directory to process
 # @arg $2 string Output directory name
@@ -1610,10 +1835,14 @@ r_compile_gtf() {
 
 submodule get_studies
 # @description Parse study names from StudyInfo.csv files
-# TODO: This whole submodule should eventually be either
-# * Removed
-# * Expanded into a file that stores all project/study settings (database)
-# I had to had a new line to the end of StudyInfo.csv, may break things?
+#
+# TODO
+#
+# * This whole submodule should eventually be either
+# * Removed
+# * Expanded into a file that stores all project/study settings (database)
+# * I had to had a new line to the end of StudyInfo.csv, may break things?
+#
 # Example:
 # ExpNumb,ExpLabel,BackgroundSD,ZscoreJoinSD,AnalysisBy
 # 1,ExpName1,NA,NA,UserInitials
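With StudyInfo.csv laid out as in the example above, the study labels can be pulled out with a plain read loop; the guard on the last line also tolerates a file without a trailing newline, which the TODO worries about.

```bash
#!/usr/bin/env bash
# Sketch only: parse ExpNumb/ExpLabel pairs from a StudyInfo.csv laid out like
# the example above. The `|| [[ -n ... ]]` guard tolerates a file whose last
# line has no trailing newline.
set -euo pipefail

studyinfo=${1:?usage: $0 <StudyInfo.csv>}

{
  read -r _header   # skip: ExpNumb,ExpLabel,BackgroundSD,ZscoreJoinSD,AnalysisBy
  while IFS=',' read -r expnum explabel _sd _joinsd _analyst || [[ -n $expnum ]]; do
    [[ -z $expnum ]] && continue
    echo "Exp$expnum -> $explabel"
  done
} < "$studyinfo"
```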
@@ -1685,14 +1914,9 @@ get_studies() {
 
 
 submodule documentation
-# @section Documentation
-# @description Generates markdown documentation from this script using shdoc
-#
-# TODO
-# * We can include images in the markdown file but not natively with shdoc
-# * Need to add a post processor
-# * Or use a 'veryuniqueword' and some fancy sed
+# @description Generates shdoc markdown from this script
 # @noargs
+# @internal
 documentation() {
 debug "Running: ${FUNCNAME[0]}"
 # Print markdown to stdout
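shdoc reads a script on stdin and writes markdown to stdout, so the documentation submodule can be approximated from the shell as below; the qhtcp.sh filename is an assumption.

```bash
#!/usr/bin/env bash
# Sketch only: generate markdown from the shdoc annotations in this script.
# "qhtcp.sh" is an assumed filename for the workflow script; adjust as needed.
set -euo pipefail

command -v shdoc > /dev/null || { echo "shdoc not found in PATH" >&2; exit 1; }

shdoc < qhtcp.sh > qhtcp.md   # shdoc reads the script on stdin, writes markdown to stdout
echo "Wrote qhtcp.md"
```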