QHTCP - Hartman Lab User’s Guide
Overview and Introduction to Directory Structure
There should be at least 4 subdirectories to organize Q-HTCP data and analysis. The parent directory is simply called ‘Q-HTCP’, and its subdirectories are described below (Fig. 1):
‘ExpJobs’ - This directory contains raw image data and image analysis results for the entire collection of Q-HTCP experiments. We recommend that each subdirectory within ‘ExpJobs’ represent a single Q-HTCP experiment and be named using the following convention (AB yyyy_mmdd_PerturbationsOfInterest): experimenter initials (‘AB ’), date (‘yyyy_mmdd_’), and brief description (e.g., ‘drugs_medias’). Each subdirectory contains the raw image folders for that experiment (a series of N folders with successive integer labels 1 to N, each folder containing the time series of images for a single cell array). It also contains a user-supplied subfolder, which must be named ‘MasterPlateFiles’ and must contain two Excel files, one named ‘DrugMedia_*experimentdescription*’ and the other named ‘MasterPlate_*experimentdescription*’. The prefix of each file name, including the underscore (‘DrugMedia_’ or ‘MasterPlate_’), is required; the description that follows it is optional. Generally the ‘DrugMedia_’ file merits a description. If the standard MasterPlate_Template file is being used, there is no need to customize the name. If the template is modified, however, it is recommended to rename it and describe it accordingly; a useful convention is to give the MP files the same name as the experiment (i.e., the parent ExpJobs subdirectory described above) after the underscore. The ‘MasterPlate_’ file contains the associated cell array information (culture IDs for all of the cell arrays in the experiment), while the ‘DrugMedia_’ file contains information about the media that the cell arrays are printed to. Together they encapsulate and define the experimental design. The Q-HTCP image folders and the ‘MasterPlateFiles’ folder are the inputs for image analysis with EASY software.
As further described below, EASY will automatically generate a ‘Results’ directory (within the ExpJobs/‘ExperimentJob’ folder) with a name that consists of a system-generated timestamp and an optional short description provided by the user (Fig. 2). The ‘Results’ directory is created and entered using the “File >> New Experiment” dropdown in EASY. Multiple ‘Results’ directories may be created (and uniquely named) within an ‘ExperimentJob’ folder.
‘EASY’ - This directory contains the GUI-enabled MATLAB software to accomplish image analysis and growth curve fitting. EASY analyzes Q-HTCP image data within an ‘ExperimentJob’ folder (described above; each cell array has its own folder containing its entire time series of images). EASY analysis produces image quantification data and growth curve fitting results for each cell array; these results are subsequently assembled into a single file and labeled using information contained in the ‘MasterPlate_’ and ‘DrugMedia_’ files in the ‘MasterPlateFiles’ subdirectory. The final files (named ‘!!ResultsStd_.txt’ or ‘!!ResultsELr_.txt’) are produced in a subdirectory that EASY creates within the ‘ExperimentJob’ folder, named ‘/ResultsTimeStampDesc/PrintResults’ (Fig. 2). The /EASY directory is simply where the latest EASY version resides (additional versions in development or legacy versions may also be stored there). Note: The raw data inputs and result outputs for EASY are kept in the ‘ExpJobs’ directory. EASY also outputs a ‘.mat’ file, which is stored in the ‘matResults’ folder; its name carries the timestamp and user-provided description that are appended to the ‘Results’ folder name when ‘New Experiment’ is executed from the ‘File’ dropdown menu in the EASY console.
‘EZview’ - This directory contains the GUI-enabled MATLAB software to conveniently and efficiently mine the raw cell array image data for a Q-HTCP experiment. It takes the results ‘.mat’ file (created by EASY software and stored in …/matResults/…) as an input and permits the user to navigate through the raw image data and growth curve results for the experiment. The /EZview directory provides a place for storing the latest EZview version (as well as other EZview versions).
‘StudiesQHTCP’ - A software composite (MATLAB, Java, R, Python, Perl, Shell) that takes growth curve results (created by EASY software) as an input and successively generates interaction Z-score results, which are used for graphing gene interactions, clustering, Gene Ontology analysis, and other ways of interpreting and visualizing the experimental quality and outcomes. {The /StudiesQHTCP folder contains the ordered command line scripts that call sets of other scripts to perform data selection and adaptation from the extracted text results spreadsheet found in the /ExpJobs/experiment name/Results…/PrintResults/ folder, in particular the ‘user customize interactionCode4experiment.R’ file. It also contains a multitude of R-generated plots based on the selected data and possible adaptation. All clustering and Gene Ontology analyses are derived from the ‘ZScores_Interaction.csv’ file found in the /ZScores subdirectory.}
‘Master Plates’ - This optional folder is a convenient place to store copies of the ‘MasterPlate_’ and ‘DrugMedia_’ file templates, along with previously used files that may have been modified and could be reused or further modified to enable future analyses. These two file types are required in the ‘MasterPlateFiles’ folder, which catalogs experimental information specific to individual Jobs in the ExpJobs folder, as described further below.
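The layout rules for an ‘ExpJobs’ experiment folder described above can be checked before starting EASY. The following is a minimal Python sketch, not part of the Q-HTCP software. The function name `check_experiment_folder` is hypothetical, the regular expression is only one reading of the ‘AB yyyy_mmdd_Description’ convention, and requiring exactly one file per prefix is an assumption.

```python
import re
from pathlib import Path

# Assumed reading of the convention: 2-3 initials, optional space or underscore,
# yyyy_mmdd, underscore, free-text description.
NAME_PATTERN = re.compile(r"^[A-Za-z]{2,3}[ _]?\d{4}_\d{4}_.+$")


def check_experiment_folder(exp_dir):
    """Return a list of layout problems found in one ExpJobs experiment folder."""
    exp_dir = Path(exp_dir)
    if not exp_dir.is_dir():
        return [f"'{exp_dir}' is not a directory"]

    problems = []
    if not NAME_PATTERN.match(exp_dir.name):
        problems.append(f"folder name '{exp_dir.name}' does not match 'AB yyyy_mmdd_Description'")

    # Raw image folders should be labeled with successive integers 1..N.
    numbered = sorted(int(p.name) for p in exp_dir.iterdir()
                      if p.is_dir() and p.name.isdecimal())
    if not numbered:
        problems.append("no numbered raw image folders found")
    elif numbered != list(range(1, len(numbered) + 1)):
        problems.append(f"image folders are not successive integers 1..N: {numbered}")

    # 'MasterPlateFiles' must hold one DrugMedia_ and one MasterPlate_ Excel file.
    mp_dir = exp_dir / "MasterPlateFiles"
    if not mp_dir.is_dir():
        problems.append("missing 'MasterPlateFiles' subfolder")
    else:
        for prefix in ("DrugMedia_", "MasterPlate_"):
            matches = [p for p in mp_dir.iterdir()
                       if p.name.startswith(prefix) and p.suffix.lower() in (".xls", ".xlsx")]
            if len(matches) != 1:
                problems.append(f"expected exactly one '{prefix}' Excel file, found {len(matches)}")
    return problems
```

An empty list means the folder matches the layout as interpreted here; each string in a non-empty list names one thing to fix.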
ExpJobs
EASY
/EASY
/figs
/PTmats
datatipp.m
DgenNoGrowthResults200809.m
DMPexcel2mat_2023winLinix.m
EASYconsole.fig
EASYconsole.m
NCdisplayGui.m
NCfitImCFparforFailGbl2.m
NCscurImCF_3parfor.m
NCsingleDisplay.m
NIcircle.m
NImParamRadiusGui.m
NIscanIntensBGpar4GblFnc.m
p4loop8c.m
par4Gbl_Main8c.m
par4GblFnc8c.m
Open the EASY Software.
Finally, click on the ‘GenReports’ dropdown and select ‘Results_Generate.’
You will first see ‘!!ResultsElr_.txt’ generated in the ‘PrintResults’ folder. Refreshing the folder view will show its file size increasing until ‘!!ResultsStd_.txt’ starts being generated. When finished, ‘!!ResultsStd_.txt’ will be about the same file size; it is the file that should be used in the subsequent StudiesQHTCP analysis.
‘NoGrowth_.txt’, and ‘GrowthOnly_.txt’ files will be generated in the ‘PrintResults’ folder.
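Because an ‘ExperimentJob’ folder can hold several ‘Results’ directories, it can help to list every generated results file before choosing one for the StudiesQHTCP analysis. The sketch below is a hypothetical Python helper, not part of EASY. It assumes that every results directory name starts with ‘Results’ and that any text after the ‘!!ResultsStd_’ prefix is optional.

```python
from pathlib import Path


def find_results_files(experiment_job_dir, kind="Std"):
    """List '!!Results<kind>_*.txt' files under every Results*/PrintResults folder.

    Files are returned newest first (by modification time).
    """
    job = Path(experiment_job_dir)
    pattern = f"Results*/PrintResults/!!Results{kind}_*.txt"
    hits = [p for p in job.glob(pattern) if p.is_file()]
    return sorted(hits, key=lambda p: p.stat().st_mtime, reverse=True)
```

Calling `find_results_files("/path/to/ExpJobs/ExperimentJob")` returns `Path` objects; an empty list suggests that ‘Results_Generate’ has not yet been run for that job.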
System for Multi-QHTCP-Experiment Gene Interaction Profiling Analysis
“StudiesQHTCP” is a program that incorporates several command line scripts and provides a directory structure for input and output files.
The analysis system involves Sean Santos’ R code for calculating genetic interaction values and z-scores; clustering of gene interaction z-scores using Recursive Expectation-Maximization clustering (REMc), which relies on WEKA and a Java implementation; and GO Term Finder (GTF) analyses of the REMc clusters, which use Python. Jingyu Guo worked on the REMc and GTF code, and Remy Cron incorporated it into a Java ‘.jar’ file so that it can be run by multiple users from a shared folder. The executable ‘.jar’ files and all associated Python, Perl, and R scripts are executed via a single master shell script, REMcMaster3.sh. [See section IV.7]
For MacOS: It is recommended that MacOS users install Homebrew for easy installation of the following packages. The command to install Homebrew, followed by the commands to install the necessary packages, is listed below.
export HOMEBREW_BREW_GIT_REMOTE=https://github.com/Homebrew/brew
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
sudo cpan File::Map
sudo cpan ExtUtils::PkgConfig
sudo cpan GD
brew install graphviz
brew install gd
sudo cpan GO::TermFinder
brew install pdftk-java
brew install pandoc
**For Linux:** The package manager commands used below are for Debian-based distributions.
If using Fedora or CentOS, you may need to use ‘dnf’ or ‘yum’ in place of ‘apt-get’
sudo cpan File::Map
sudo cpan ExtUtils::PkgConfig
sudo cpan GD
sudo apt-get install graphviz
sudo apt-get install libgd-dev
sudo cpan GO::TermFinder
sudo apt-get install pdftk-java
sudo apt-get install pandoc
For R:
install.packages("BiocManager")
BiocManager::install("org.Sc.sgd.db")
install.packages('ontologyIndex', dep=TRUE)
install.packages('ggrepel', dep=TRUE)
install.packages('tidyverse', dep=TRUE)
install.packages('sos', dep=TRUE)
install.packages('openxlsx', dep=TRUE)
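After installation, a quick check that the command-line tools are on the PATH can save a failed run later. This is a small Python sketch under stated assumptions: the executable names (for example, ‘dot’ for graphviz and ‘pdftk’ for pdftk-java) are the names these packages commonly install, and the list is illustrative rather than a complete inventory of what the StudiesQHTCP scripts call.

```python
import shutil

# Executable name -> package it is assumed to come from (illustrative list).
REQUIRED_TOOLS = {
    "Rscript": "R",
    "perl": "Perl (for GO::TermFinder)",
    "java": "Java (for the REMc .jar)",
    "dot": "graphviz",
    "pdftk": "pdftk-java",
    "pandoc": "pandoc",
}


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` whose executable is not found on the PATH."""
    return {exe: pkg for exe, pkg in tools.items() if shutil.which(exe) is None}


if __name__ == "__main__":
    for exe, pkg in missing_tools().items():
        print(f"missing: {exe} (provided by {pkg})")
```

The check only confirms that an executable exists; it does not verify versions or the Perl and R packages themselves.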
/StudiesQHTCP
StudiesDataArchive.txt
/ExpStudy (user named)
/A_QHTCP Study Design and Notes
/Code
22_0602_Remy_DAmPsList.txt
All_SGD_GOTerms_for_QHTCPtk.csv
All_SGD_GOTerms.csv
/devStuff
InteractTemplateB4fixes.R
InteractTemplateB4Prompt4SDinput.R
gene_association.sgd
gene_ontology_edit.obo
go_terms.tab
GTAtemplate.R
ORFs_w_DAmP_list.txt
PairwiseLK.R
Parameters.csv
/ScriptTemplates {preserves starting templates of code modified by user}
/BU_Legacy
InteractTemplate.R
Concatenate_GTF_results.py
Concatenate_GTF_resultsB4REMcMaster2.py
GTAtemplate.R
InteractionTemplate230119.R
JoinInteractExps.R
JoinInteractExps3dev.R
PairwiseK_lbl.r
PairwiseL_lbl.R
PairwiseLK.R
Remy_yor_dF_correlation_study.R
TSHeatmaps5dev2.R
SGD_features.tab
SGD_features.tab.txt
/Sscripts
18_0205_heatmaps_zscores_2SD_color_ARem_Z_lm.R
22_0603_Remy_Exlcude_DAmPs.R
cmd_Doxo_SumZScore_Z_lm_Interaction_d...alidationedit.R
cmd_ScoreAllGOTerms_From_Z_lm_V2.R
Compare_GTF_Averages_BetweenScreens_lm_Kvals_v2.R
Compare_GTF_Averages_BetweenScreens_lm_Lvals_v2.R
Compare_GTF_Averages_BetweenScreens_lm_v2.R
GO_list_All_ChildTerms_lmZscore_max100child_Heatmaps_3terms_V2.R
GO_list_All_ChildTerms_lmZscore_max100child_Heatmaps_4terms_aging.R
GO_list_All_ChildTerms_lmZscore_max100child_Heatmaps_4terms_V2.R
GO_list_All_ChildTerms_lmZscore_max100child_Heatmaps_5terms_V2.R
GO_list_All_ChildTerms_lmZscore_max100child_Heatmaps_V2.R
ScoreAllGOTerms_From_Z_lm_V2.R
StudyInfo.csv
TSHeatmaps5dev2.R
/Documentation
**\*\*\*ADD IN SEAN’S MANUAL\*\*\***
Jingyu_REMc_Instruction for clustering and...2013Mar.docx
/LegacyDocs
QHTCP Analysis SystemRev2.docx
QHTCP Analysis SystemRev2a.docx
QHTCP Analysis SystemRev2b.docx
QHTCP Analysis SystemRev2b0.docx
QHTCP Analysis SystemRev2c.docx
/Exp1
/backups
InteractTemplateB4Prompt4SDinput.R
ExpFrontend.m
Z_InteractionTemplate.R
Notes Exp1
/ZScores
/Exp2
/backups
InteractTemplateB4Prompt4SDinput.R
ExpFrontend.m
Z_InteractionTemplate.R
Notes Exp2
/ZScores
/Exp3
/backups
InteractTemplateB4Prompt4SDinput.R
ExpFrontend.m
Z_InteractionTemplate.R
Notes Exp3
/ZScores
/Exp4
/backups
InteractTemplateB4Prompt4SDinput.R
ExpFrontend.m
Z_InteractionTemplate.R
Notes Exp4
/ZScores
/GTAresults
/Exp1
/Exp2
/Exp3
/Exp4
/REMc
AddShiftVals2.R
DconJG2.py
GeneByGOAttributeMatrix_nofiltering-2009Dec07.tab
/GTF
analyze_v2.pl
concatenate_GTF_Results.py
gene_association.sgd
gene_ontology_edi.obd
GOontologyPar.sh
SeanEmailPython2
SGD_features.tab
SGD_features.tab.txt
Terms2tsv_v4.pl
/Component
analyze_v2.pl
concatenate_GTF_Results.py
gene_association.sgd
gene_ontology_edi.obd
ORF_List_DAmPs_Only.txt
ORF_List_Without_DAmPs.txt
ORFs_w_DAmP_list.txt
SGD_features.tab
SGD_features.tab.txt
terms2tsv_v4.pl
/Function
analyze_v2.pl
concatenate_GTF_Results.py
gene_association.sgd
gene_ontology_edi.obd
ORF_List_DAmPs_Only.txt
ORF_List_Without_DAmPs.txt
ORFs_w_DAmP_list.txt
SGD_features.tab
SGD_features.tab.txt
terms2tsv_v4.pl
/Process
analyze_v2.pl
concatenate_GTF_Results.py
gene_association.sgd
gene_ontology_edi.obd
ORF_List_DAmPs_Only.txt
ORF_List_Without_DAmPs.txt
ORFs_w_DAmP_list.txt
SGD_features.tab
SGD_features.tab.txt
terms2tsv_v4.pl
jingyuJava_1_7_extractLib.jar
JoinInteractExps3dev.R
mComponent.sh
mFunction.sh
mProcess.sh
Notes/ REMc, GTF_Ontologies and Associated_Heatmaps
ORF_List_DAmPs_Only.txt
ORF_List_Without_DAmPs.txt
ORFs_w_DAmP_list.txt
/REMcHeatmaps
/REMcHeatmapsWithHomolgy
17_0503_DAmPs_Only.txt
/Homology
REMcHeatmaps_Z_lm_wDAmPs_andHomology_221212.R
Yeast_Human_Homology_Mapping_biomaRt_18_0902.csv
REMcJar2.sh
REMcJar2old.sh
REMcMaster2.sh
REMcMaster3.sh
/TermSpecificHeatmaps
{Note: The TSHeatmaps… .R script contains a **Table** section near the start that defines a default set of tables. If the user wishes to use different tables (e.g., All_SGD_GOTerms_for_... .csv), that section should be modified and the TSH… .R script relabeled to reflect the user modification; the script is included in the /Code section. Users should always write notes related to code modifications and to study goals and strategies.}
/test-DevStuff
Int4DoxGem.R
InteractionTemplate230119cutdown4compareSSV6.R
REMcMaster2Bad.sh
As stated earlier, the user can add folders to back up temporary results, study-related notes, or other related work. However, it is advised to set up and use separate STUDIES when evaluating differing data sets whether that is from experiment results files or from differing data selections in the first interaction … .R script stage. This reduces confusion at the time of the study and especially for those reviewing study analysis in the future.
To begin, consider the goals of the study and design a strategy of experiments to include in the study. Consider the quality of the experiment runs using EZview to see if there are systematic problems that are readily detectable. In some cases, one may wish to design a ‘pilot’ study for discovery purposes. There is no problem doing that; just take a template study, copy it, and rename it as XYZpilotStudy etc. However, careful examination of the experimental results using EZview will likely save time in the long run. One may be able to run the interaction Z scores relatively quickly; the main challenge there is the user creation of customized interaction… .R code, which I have tried to simplify by locating the user edits near the top.
Preliminary Task
The user specifies the arrangement of the data (in ‘StudyInfo.csv’) by assigning it to /Exp1, /Exp2, /Exp3, or /Exp4, which is particularly relevant for clustering, as results will be ordered left to right according to experiment number.
A utility (ExpFrontend.m) was made for recording into a spreadsheet (‘StudiesDataArchive.txt’) the date and files used (i.e., directory paths to the !!Results files used as input for Z-interaction script) for each multi-experiment study.
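The record-keeping idea behind ExpFrontend.m can be illustrated with a short Python sketch. It is not the MATLAB utility and does not reproduce its file format: the tab-separated layout (date, study, Exp folder, results path) and the function name `record_results_source` are hypothetical.

```python
from datetime import date


def record_results_source(archive_path, study_name, exp_label, results_path, when=None):
    """Append one tab-separated line (date, study, Exp folder, results path) to the archive.

    The file is opened in append mode, so earlier entries are never overwritten.
    """
    when = when or date.today()
    line = "\t".join([when.isoformat(), study_name, exp_label, str(results_path)])
    with open(archive_path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
    return line
```

Appending rather than rewriting keeps a running history of which !!Results file fed each /Exp folder, which is the purpose the archive serves.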
Experiment Specific Interaction Zscores generation
2. In your files directory, open the /Code folder, edit the ‘StudyInfo.csv’ file
3. Open MATLAB and, in the application, navigate to each specific /Exp folder, then call and execute ExpFrontend.m by clicking the play icon. **Use the “Open file” function from within MATLAB; do not double-click on the file from the directory.** When prompted, navigate to the ExpJobs folder and the PrintResults folder within the correct job folder. Repeat this for every Exp# folder, depending on how many experiments are being performed. The Exp# folder must correspond to the StudyInfo.csv created above.
Note: Before doing this, it’s a good idea to compare the Ref and non-Ref CPP average and median values. If they are not approximately equal, it may be helpful to standardize the Ref values to the measures of central tendency of the non-Refs, because the Ref CPPs are used for the z-scores, which should be centered around zero.
Do this to document the names, dates and paths of all the studies and experiment data used in each study. Note, one should only have a single ‘!!Results…’ file for each /Exp_ to prevent ambiguity and confusion. If you decide to use a new or different ‘!!Results…’ sheet from what was used in a previous “QHTCP Study”, remove the one not being used. NOTE: if you copy a ‘!!Results…’ file in by hand, it will not be recorded in the ‘StudiesDataArchive.txt’ file and so will not be documented for future reference. If you use the ExpFrontend.m utility it will append the new source for the raw !!Results… to the ‘StudiesDataArchive.txt’ file.
As stated above, it is advantageous to think about the comparisons one wishes to make so as to order the experiments in a rational way as it relates to the presentation of plots. That is, which results from sheets and selected ‘interaction … .R’, user modified script, is used in /Exp1, Exp2, Exp3 and Exp4 as explained in the following section.
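The comparison of Ref and non-Ref CPP values suggested in the note above can be sketched in Python as follows. The numbers are made up for illustration, and the additive median shift is only one possible way to standardize; the guide does not prescribe a specific correction.

```python
from statistics import mean, median


def summarize(values):
    """Mean and median of one group of CPP values."""
    return {"mean": mean(values), "median": median(values)}


def median_shift(ref_values, nonref_values):
    """Offset that, added to every Ref value, aligns the Ref median with the non-Ref median."""
    return median(nonref_values) - median(ref_values)


# Hypothetical CPP values for reference (Ref) and non-reference cultures.
ref = [48.0, 50.0, 52.0, 51.0, 49.0]
nonref = [53.0, 55.0, 54.0, 90.0, 56.0]

shift = median_shift(ref, nonref)
adjusted_ref = [v + shift for v in ref]

print("Ref:", summarize(ref))
print("non-Ref:", summarize(nonref))
print("shift applied to Ref:", shift)
```

Using the median rather than the mean keeps the single outlying non-Ref value (90.0) from dominating the correction.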
4. In each /Exp# folder, rename the Z_InteractionTemplate.R script according to the experiment focus
6. Open a terminal, navigate to each /Exp# folder, and execute the (customized) ‘Z_InteractionTemplate_…’ script by using the command line below:
Rscript RenamedInteractionTemplate.R \!\!Results… .txt
**need to change wording to choose SD of Delta_Background to exclude Data from analysis.
[1] "Be sure to enter Background noise filter standard deviation i.e., 3 or 5 per Sean"
Enter a Standard Deviation value to noise filter >>
[1] Enter Standard deviation value for removing data for cultures due to high background (e.g., contaminated cultures). Generally set this very high (e.g., ‘20’) on the first run in order NOT to remove data. Review QC data and inspect raw image data to decide if it is desirable to remove data, and then rerun analysis.
Enter a Background SD threshold for EXCLUDING culture data from further analysis:
The script will ask the user to input a ‘Background Standard Deviation Value’. This value removes data where there is high pixel intensity in the background regions of a spot culture (i.e., suspected contamination). 5 is the minimum recommended value, because lower values result in more data being removed, which is often undesirable if contamination occurs late, after the yeast culture has reached carrying capacity. Choosing the value is most often “trial and error”: the ‘Frequency_Delta_Background.pdf’ report in the /Exp_/ZScores/QC/ folder can be used to evaluate whether the chosen value was suitable (and if not, the analysis can simply be rerun with a better choice). In general, err on the high side, with a BSD of 10 or 12. One can also use EZview to examine the raw images and the individual cultures included or excluded as a consequence of the selected value. Background values are reported in the results sheet and so could also be analyzed there.
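To see why a lower threshold removes more data, consider the following Python sketch. It assumes, without confirmation from the R script, that a culture is excluded when its Delta_Background value exceeds the mean of all values by more than the chosen number of standard deviations. The values are invented for illustration.

```python
from statistics import mean, stdev


def background_cutoff(delta_background, n_sd):
    """Cutoff = mean + n_sd * sample standard deviation (assumed rule, see lead-in)."""
    return mean(delta_background) + n_sd * stdev(delta_background)


def excluded_indices(delta_background, n_sd):
    """Indices of cultures whose Delta_Background lies above the cutoff."""
    cutoff = background_cutoff(delta_background, n_sd)
    return [i for i, value in enumerate(delta_background) if value > cutoff]


# Hypothetical Delta_Background values; the last culture has a very high background.
values = [10, 12, 11, 9, 10, 11, 10, 12, 9, 200]

print("excluded at 2 SD:", excluded_indices(values, 2))
print("excluded at 5 SD:", excluded_indices(values, 5))
```

With these numbers the high-background culture is excluded at 2 SD but retained at 5 SD, which mirrors the advice to start high and lower the threshold only after reviewing the QC report and raw images.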
(For new terminal users, directory navigation tips are described below)
To navigate to the directory one can use the directory GUI (in X2Go, use the GUI to navigate to desired operating directory and then from the ‘File’ menu, choose “Open in Terminal’)
Alternatively, navigate there through the terminal window: ‘pwd’ prints the current working directory, and ‘ls’ lists the contents of the current directory. ‘cd’ followed by the name of a subdirectory moves down into it; ‘cd ..’ changes to the parent directory.
The tab key can be used to autofill unique characters after typing the initial letters of a folder or file you wish to call.
The template structure above assists the user with organization and management of Q-HTCP files and provides a uniform directory structure to streamline reference across different users and experiments.
Since we are systematically comparing perturbations, most Q-HTCP studies will consist of either 2 or 4 experiment subfolders.
The Zscores files are used for subsequent analyses, including REMc, GTA and Term Specific Heatmaps. These further analyses are described below and can be completed in any order and/or concurrently from separate terminals.
**Annotate Files produced and comment out code that produces files that are obsolete or clutter.
REMc
7. Navigate to the /REMc directory and run the following shell script:
[jwrodger@hartmanlab REMc]$ sh REMcMaster3.sh
The command line will request the user to enter a standard deviation multiplier (factor) that will filter the ZScore data accordingly for use with REMc. That value will also be stored to the StudyInfo.csv file where the user entered descriptive Labels in at the start of this entire QHTCP study. Those labels are used throughout the process on all the graphics that are produced.
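The effect of such a cutoff on the Z-score table can be pictured with a short Python sketch. How REMcMaster3.sh actually applies the factor is not documented here, so the rule below (keep a gene if its absolute Z-score reaches the cutoff in at least one experiment) is an assumption, and the gene labels are placeholders.

```python
def filter_by_zscore(z_by_gene, cutoff):
    """Keep genes whose absolute Z-score meets the cutoff in at least one experiment.

    `z_by_gene` maps a gene label to its Z-scores across Exp1..ExpN; None marks missing data.
    """
    return {
        gene: scores
        for gene, scores in z_by_gene.items()
        if any(z is not None and abs(z) >= cutoff for z in scores)
    }


# Hypothetical interaction Z-scores for three genes across two experiments.
zscores = {
    "GENE_A": [0.4, -0.9],
    "GENE_B": [2.6, 0.1],
    "GENE_C": [None, -3.2],
}

print(sorted(filter_by_zscore(zscores, 2)))
```

Raising the cutoff shrinks the set of genes passed on to clustering; lowering it admits more genes with weaker interactions.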
The REMcMaster3.sh script will execute the entire process in roughly thirty minutes to possibly an hour. REMcMaster3.sh script tasks are as follows:
**Annotate Files produced and comment out code that produces files that are obsolete or clutter
GTA related work
8. Navigate to the Code directory and open a terminal to run the following Rscript to produce the GTA results for each Exp#:
[jwrodger@hartmanlab Code]$ Rscript GTAtemplate.R
9. Still in the /Code directory, run the following Rscript, entering two Exp# files as input arguments to compare:
[jwrodger@hartmanlab Code]$ Rscript PairwiseLK.R Exp1 Exp2
Term Specific Heatmaps Production
10. Navigate to the /Code directory and run the following Rscript to produce the Term Specific Heatmaps:
[jwrodger@hartmanlab Code]$ Rscript TSHeatmaps5dev2.R
**Naming of ‘StudiesQHTCP/Study’ output files. The files produced in StudiesQHTCP folders have standard file names, which will initially be the same across all studies. However, when the analysis is complete, it may be desirable to move some of the results files outside of their native directories, and it is therefore useful to give them unique and recognizable names. Descriptive names can be added to all files by running two scripts from a terminal after navigating to the corresponding code directory:
i. “sh RenameZscores_GTAresults.sh” will add names provided in ‘StudyInfo.csv’ to files in the ‘Zscores’ subdirectory of the respective ‘Exp’ folder and to files in the ‘GTAresults’ folder.
ii. “sh RenameREMcHtmaps_GTFfiles.sh” will append the label given by the user when prompted to files in the ‘REMc’ and ‘TermSpecificHeatmaps’ folders.
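The idea behind the two renaming scripts can be sketched in Python as below. This is not a port of either shell script: the function name `add_label_to_files` is hypothetical, and placing the label as a prefix is an assumption (the scripts may instead append it).

```python
from pathlib import Path


def add_label_to_files(folder, label, dry_run=True):
    """Prefix each file in `folder` with '<label>_' and return the (old, new) name pairs.

    With dry_run=True nothing is renamed, so the planned changes can be reviewed first.
    Files that already carry the prefix are skipped, so rerunning is harmless.
    """
    renames = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and not path.name.startswith(label + "_"):
            target = path.with_name(f"{label}_{path.name}")
            renames.append((path.name, target.name))
            if not dry_run:
                path.rename(target)
    return renames
```

Running once with `dry_run=True` and inspecting the returned pairs before renaming avoids surprises when standard file names are shared across studies.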
https://weka.sourceforge.io/doc.dev/weka/clusterers/RandomizableDensityBasedClusterer.html#setSeed-int-
The above link is relevant to why REMc results are always the same (presumably because seed selection is non-random).
Questions to address / notes to incorporate here or elsewhere:
We need full documentation for all of the current workflow. There are different documents that need to be integrated. This will need to be updated as we make improvements to the system.
In Easy -
MasterPlate_ file must have ydl227c in the orf column, or else Z_interaction.R will fail, because it can’t calculate shift values.
Make sure there are no special characters; e.g., (), “, ‘, ?, etc.; dash and underscore are ok as delimiters
Drug_Media_ file must have letter character to be read as ‘text’.
MasterPlate_ file and DrugMedia_ are .xlsx or .xls, but !!Results_ is .txt.
In Z_interactions.R, does it require a zero concentration/perturbation (should we use zero for the low conc, even if it’s not zero), e.g., in order to do the shift correctly.
Need to enable all file types (not only .xls) as the default for GenerateResults (to select MP and DM files as .xlsx).
Explore differences between the ELR and STD files - 24_0414; John R modified Z script to format ELR file for Z_interactions.R analysis.
To keep time stamps when transferring with FileZilla, go to the transfer drop down and turn it on, see https://filezillapro.com/docs/v3/advanced/preserve-timestamps/
Could we change the ‘MasterPlateFiles’ folder label in EASY to ‘MasterPlate_DrugMedia’ (since there should be only one MP and there is also a DM file required)?
I was also thinking of adding a ‘MasterPlateFilesOnly’ folder to the QHTCP directory template where one could house different MPFiles (e.g., with and without damps, with and without Refs on all MPs, etc; other custom MPFiles, updated versions, etc)
Currently updated files are in ‘23_1011_NewUpdatedMasterPlate_Files’ on Mac (yeast strains/23_0914…/)
For EASY to report cell array positions (plate_row_column) to facilitate analyzing plate artifacts. The MP File in Col 3 is called ‘LibraryLocation’ and is reported after ‘Specifics’ in the !!Results.
Can EASY/StudiesQ-HTCP be updated at any time by rerunning with updated MP file (new information for gene, desc, etc)- or maybe better to always start with a new template?
Need to be aware of file formatting to avoid dates (e.g., with gene names like MAY24, OCT1, etc, and with plate locations 1E1, 1E2, etc)- this has been less of a problem.
In StudiesQHTCP folders, remember to annotate Exp1, Exp2, in the StudyInfo.csv file.
Where are gene names called from for labeling REMc heatmaps, TSHeatmaps, Z-interaction graphs, etc.? Is this file in the QHTCP ‘code’ folder, or is it in the results file (and thus ultimately the MP file)?
Is it ok for a MasterPlate_ file to have multiple sheets (e.g., readme tab- is only the first tab read in)?
What are the rules for pulling information from the MasterPlateFile to the !!Results_ (e.g., is it the column or the Header Name, etc that is searched? Particular cells in the DrugMedia file?).
Modifier, Conc are from DM sheet, and refer to the agar media arrays. OrfRep is from MasterPlate_ File. ‘Specifics’ (Last Column) is experiment specific and accommodate designs involving differences across the multi-well liquid arrays. ‘StrainBkGrd’ (now ‘Library location’) is in the 3rd column and reported after ‘Specifics’ at the last col of the ‘!!Results..’ file.
Do we have / could we make an indicator- work in progress or idle/complete with MP/DM and after gen-report. Now, we can check for the MPDMmat.mat file, or we can look in PrintResults, but would be nice to know without looking there.
File>>Load Experiment wasn’t working (no popup to redirect). Check this again.
In EZview:
What do the File, Parameters and Tools dropdown menu items do?
What is the ‘Hide’ button for?
What is the ‘composite’ overlay good for?
What is the file that is used for the ‘Info’ function above the Gene Directory.
how to wand over labels - how does that work in matlab?
In StudiesQHTCP:
For front end, be more specific about where to navigate to find results file.
ELR type file errors out - needs to be produced in a compatible format.
**change wording to “choose SD for Delta_Background to exclude spot culture growth curve data from interaction analysis”.
GTF:
Limit to smaller terms.
Enable sort by term size.
There needs to be an annotated set of MasterPlate File templates. These could be numbered and annotated chronologically, and each experiment could specify which instance of the MP file template is used. When possible/ if necessary, the folders of plate images should be reordered rather than reordering the MP file. Each Exp, should have an ExpDesc spreadsheet in it indicating the Exp Design (summarizing what is expected in the !!Results file), based on the ‘MasterPlate_’ and ‘DrugMedia_’ files.
Need to add Ref to Blank positions in the new library construction.
In EZview:
John R made a version that runs the original Guide, the ‘exported’, or the ‘migrated’ Forms of the program. A variety of versions didn’t work very well. The original program, with some improvements, seems to work the best. We should try to optimize it.
{AppDesigner
/mnt/data/EZview/EZview2023/EZviewDev23_0921POSadaptedOnM4800_wlapp
Use EZvStartup which calls EZviewGui_7.mlapp
GUIDE
/mnt/data/EZview/EZview2023/EZviewDev23_0919POScleanup4Pub_MigrationWorkingFileExport_wlapp
Use the standard EZviewGui.m to start execution}
Update from John R:
“There are four EZstartup ----.m files.
EZvStartup.m -Guide version
EZvStartup_Export.m -Exported file migration version
EZvStartup_mlappLaptop.m -M4800 sized Laptop version
EZvStartup_mlappServer.m -Server sized
You can try them out at your convenience.
I have obviously not tried them out on a Mac Laptop.
Extra files etc. still there. It's a hack and chop job but it seems to work.
Location:
/mnt/data/EZview/EZview2023/EZviewDev23_1004POSadaptedOnM4800_wlapp”
Suggestions to improve EZview appearance:
Fix heatmap dimensions to match image dimensions. Is it possible to enable the user to adjust the heatmap dimensions (e.g. by dragging edges or corners to resize its window)?
Have an option to use a fixed heatmap scale across an experiment.
Check chronological experiments.
What does “SpotView” button do?
Can we add scrollbar to RFtab popup window so that it can be resized without losing view of table?
For StudiesQ-HTCP, for GTF, we need to make sure that the correct ORF pool (e.g., with or without DAmPs) is being used. Can that be a selection step in the code, or an additional step to include it in the code (try, etc.)?
The Library Locations for E rows in the MasterPlateFiles are being converted to exponentials in the !!Results files- needs to be text. One idea is to convert to text and/or use a delimiter. If they have a delimiter, perhaps prefix of ‘mp’ (converting to text) is not needed? e.g., ‘1_E1’ instead of ‘mp1E1’?
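The conversion problem described above can be reproduced in a few lines of Python. Here Python's own `float` parser stands in for whichever spreadsheet or reader performs the conversion in the actual workflow (an assumption), and the helper names are hypothetical.

```python
def looks_numeric(text):
    """True if a generic number parser would accept `text` (e.g., as scientific notation)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def delimited_location(plate, well):
    """Join master plate number and well position with an underscore, e.g. 1 and 'E1' -> '1_E1'."""
    return f"{plate}_{well}"


for label in ("1E1", "1B3", "mp1E1", delimited_location(1, "E1")):
    print(label, "-> parsed as number" if looks_numeric(label) else "-> kept as text")
```

Only the bare ‘1E1’ form is read as a number (1 × 10^1); both the ‘mp’ prefix and the underscore delimiter keep the label as text, which suggests that the delimiter alone may be sufficient.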
For TermSpecificHeatMaps, which list of GOterms did we use (see Sean’s manual, p. 14; ‘1.3 Term Specific Heatmaps’). Maybe we need a shorter, or more dedicated list.
Compare our GTF to that of YeastMine to check for ‘correctness’ of updated files?
In the new StudiesQHTCP template, edit the StudyInfo.csv on the server in LibreOffice, but leave it as a single column (or choose to open it with the comma delimiter) and edit between the commas. Do not convert it in Excel (text to columns >> resave), since this breaks the .csv format: the code will no longer run and gives a data frame error after the background STDEV is entered.
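One way to avoid spreadsheet conversions altogether is to edit a single field programmatically while leaving the comma-delimited layout untouched. The Python sketch below uses the standard `csv` module; the function name and the row and column positions are hypothetical, since the layout of StudyInfo.csv is not specified here.

```python
import csv


def set_csv_field(path, row, col, value):
    """Replace one field of a comma-delimited file in place and return all rows.

    `row` and `col` are zero-based positions; every other field is written back unchanged.
    """
    with open(path, newline="", encoding="utf-8") as fh:
        rows = list(csv.reader(fh))
    rows[row][col] = value
    with open(path, "w", newline="", encoding="utf-8") as fh:
        csv.writer(fh, lineterminator="\n").writerows(rows)
    return rows
```

Making a backup copy of the file before editing is advisable, because the function overwrites it.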
**Consider updating Z_InteractionTemplate.R in the Studies_QHTCP template folder if modifications are made for a particular study that could be useful for additional future studies. The idea would be to comment out the study-specific modifications and overwrite the existing program.
StudiesQHTCP:
RF z-interaction plots don’t include RF2; RF1 only?
We may want to set different z-score cutoffs, based on the shape of the rank plot.
We want to calculate mean and median CPPs for Ref and Non-Ref cultures. We may want to adjust the Ref data so that the median CPP values for Ref cultures are the same as, or close to, those of the Non-Ref cultures.
We should regress through the origin for the z-score interaction fitting.
Define NG(no growth), DB(?) and SM(?) on InteractionPlots.
In MPfile_templates, replace all YKL227C with YDL227C (120 instances); may only be in the file with Refs added to MPs.
Update the gene-by-GO matrix (from Dec07 2009) in the REMc folder of StudiesQHTCP.
The ‘FrontEnd’ popup message when it is played should say “Select the !!Results File (in ‘ExpJobs’ folder)” to avoid confusion / remind about QHTCP structure.
In REMcMaster3.sh, change prompt to ask for Z-score value (not standard deviation) for filtering analysis.
Can REMc cluster Aniyia’s data (extract names in place of gene names)? Does it fail because it can’t do GTF, even though it should be able to simply cluster the Int_Z-scores and label heatmaps (i.e., do this without doing GTF)?
Also, provide more detailed message when prompted to enter background level in Z_IntR.
Output Z_scoreInt file in same order as InteractionPlots.
Check all folders in template for updated files; e.g., not just the ‘code’ folder, but also Exp1/2/3/4, REMc and GTF, etc..
For EASY, need notification for successful completion after selecting drug media file with prompt ‘labeling complete, you may now generate report’.
REMc error (BMHonly run):
REMc- include LibraryLocation info in FinalTable.
This would come from looking up ‘OrfReplicate’ column in ‘MP File’, not ORF name column.
Useful to have demarcator between MP and WellPos, e.g., mp8_B24 so that clusters can be analyzed for MP artifacts (more concise preferred format is 8_B24).
End of GTA in terminal:
Full GTA result BMHonly:
The one pdf that is made can’t be opened.
When we leave out the DAmPs, we should probably still do the 2nd REF plate.
I got the same result by running REMcMaster3.sh whether I used a Zscore cutoff of 2 or 1 (2PEonly Experiment). Note: I reran the script in the same folder.
Need new strategy to check for plate artifacts - calculate REF averages and Medians and compare to non-REF averages and Medians. If need to do corrections, correct by median since these are less impacted by outlier/tails of distributions.
Heatmaps change to incorrect ones when the print heatmap function is used.
Don’t we need a program to remove the template files and keep only the results after StudiesQHTCP.
What is this step for: which package requires openxlsx, and what failed as a result of not having it?
Appendix
Notes on the standard EASY coding structure (in Matlab):
For PinTool Functionality, the following scripts must be in the EASY folder:
- NdirectPTGui.m
- NIPTdirectParmsGui.m
- NIPTsearchParmsGui.m
- NImapPT.m
- NImapPTcentA.m
- NImapPTcentroidSrc.m
- NImapPTcentroidSrcCirc.m
Other code Notes:
>The Toolbar for the EASYconsole is found in line 61 of EASYconsole.m >> ‘figure’ = on; ‘none’ = off
>PlateMapPintool button functionality: line 345 in EASYconsole.m