Ringtail package
Submodules
ringtail.cloptionparser module
- class ringtail.cloptionparser.CLOptionParser
Bases:
object
Command line option/argument parser. Options and switches are utilized in the script ‘rt_process_vs.py’.
- process_mode
operating in ‘write’ or ‘read’ mode
- Type:
str
- rtcore
ringtail core object initialized with the provided db_file
- Type:
- filters
fully parsed and organized optional filters
- Type:
dict
- file_sources
fully parsed docking results and receptor files
- Type:
dict
- writeopts
fully parsed arguments related to database writing
- Type:
dict
- storageopts
fully parsed arguments related to how the storage system behaves
- Type:
dict
- outputopts
fully parsed arguments related to output and reading from the database
- Type:
dict
- print_summary
switch to print database summary
- Type:
bool
- filtering
switch to run filtering method
- Type:
bool
- plot
switch to plot the data
- Type:
bool
- export_bookmark_db
switch to export bookmark as a new database
- Type:
bool
- export_receptor
switch to export receptor information to pdbqt
- Type:
bool
- pymol
switch to visualize ligands in pymol
- Type:
bool
- data_from_bookmark
switch to write bookmark data to the output log file
- Type:
bool
- Raises:
OptionError – Error when an option cannot be parsed correctly
- process_options(parsed_opts)
Process and organize command line options into ringtail options and filter dictionaries and ringtail core attributes
- Parameters:
parsed_opts (argparse.Namespace) – arguments provided through the cmdline_parser method.
- ringtail.cloptionparser.cmdline_parser(defaults: dict = {})
Parses options provided using the command line. All arguments are first populated with default values. If a config file is provided, these will overwrite default values. Any single arguments provided using the argument parser will overwrite default and config file values.
- Parameters:
defaults (dict) – default argument values
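The precedence described above (defaults, then config file, then explicit command-line arguments) can be sketched roughly as follows. This is not the Ringtail implementation: `resolve_options`, the two flags, and the default values are hypothetical, and the config file is passed as a JSON string to keep the example self-contained.

```python
import argparse
import json


def resolve_options(defaults, config_text=None, argv=None):
    """Layer options: defaults < config file < command-line arguments."""
    parser = argparse.ArgumentParser()
    # SUPPRESS keeps flags the user did not type out of the namespace,
    # so only explicit arguments can override the lower layers.
    parser.add_argument("--max_poses", type=int, default=argparse.SUPPRESS)
    parser.add_argument("--log_file", default=argparse.SUPPRESS)
    cli = vars(parser.parse_args(argv or []))

    options = dict(defaults)                     # 1. default values
    if config_text is not None:
        options.update(json.loads(config_text))  # 2. config file values
    options.update(cli)                          # 3. explicit CLI arguments
    return options


# Hypothetical values, for illustration only.
defaults = {"max_poses": 3, "log_file": "output_log.txt"}
config = '{"max_poses": 5, "log_file": "from_config.txt"}'
resolved = resolve_options(defaults, config, ["--max_poses", "10"])
```

Here `max_poses` comes from the command line and `log_file` from the config file, since no `--log_file` flag was given.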
ringtail.exceptions module
- exception ringtail.exceptions.DatabaseConnectionError
Bases:
StorageError
- exception ringtail.exceptions.DatabaseInsertionError
Bases:
StorageError
- exception ringtail.exceptions.DatabaseQueryError
Bases:
StorageError
- exception ringtail.exceptions.DatabaseTableCreationError
Bases:
StorageError
- exception ringtail.exceptions.DatabaseViewCreationError
Bases:
StorageError
- exception ringtail.exceptions.FileParsingError
Bases:
Exception
- exception ringtail.exceptions.MultiprocessingError
Bases:
Exception
- exception ringtail.exceptions.NoInputError
Bases:
OptionError
- exception ringtail.exceptions.OptionError
Bases:
Exception
- exception ringtail.exceptions.OutputError
Bases:
Exception
- exception ringtail.exceptions.RTCoreError
Bases:
Exception
- exception ringtail.exceptions.ResultsProcessingError
Bases:
Exception
- exception ringtail.exceptions.StorageError
Bases:
Exception
- exception ringtail.exceptions.WriteToStorageError
Bases:
Exception
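The base classes listed above determine which `except` clause catches which error. The sketch below re-declares a few of the exceptions locally as stand-ins (it does not import ringtail) to show the effect; `classify` is a hypothetical helper.

```python
# Local stand-ins mirroring the base classes listed above.
class StorageError(Exception):
    pass


class DatabaseQueryError(StorageError):
    pass


class OptionError(Exception):
    pass


class NoInputError(OptionError):
    pass


class WriteToStorageError(Exception):
    pass


def classify(exc):
    """Report which handler would catch the given exception."""
    try:
        raise exc
    except StorageError:
        return "storage"
    except OptionError:
        return "option"
    except Exception:
        return "other"


kinds = [
    classify(DatabaseQueryError("bad query")),
    classify(NoInputError("no input given")),
    classify(WriteToStorageError("write failed")),
]
```

Note that `WriteToStorageError` is listed with `Exception` as its base, so a handler for `StorageError` would not catch it.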
ringtail.interactions module
- class ringtail.interactions.InteractionFinder(rec_string, interaction_cutoff_radii)
Bases:
object
Class for handling and calculating ligand-receptor interactions.
- rec_string
string describing the receptor
- Type:
str
- interaction_cutoff_radii
cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms
- Type:
list(float)
- find_pose_interactions(lig_atomtype_list: list, lig_coordinates: list) dict
Method that identifies interactions for a pose within the given cutoff distances in the main class.
- Parameters:
lig_atomtype_list (list) – list of atoms in the ligand
lig_coordinates (list) – coordinates for the atoms in the ligand
- Returns:
all interaction details for a given ligand pose
- Return type:
dict
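As a rough picture of what a distance cutoff means here, the sketch below lists ligand-receptor atom pairs that lie within a cutoff. It is an assumption-laden illustration, not the algorithm used by `InteractionFinder`: receptor coordinates are passed in directly rather than parsed from `rec_string`, atom types are ignored, and the function name and cutoff value are hypothetical.

```python
import math


def atoms_within_cutoff(lig_coordinates, rec_coordinates, cutoff):
    """Return (ligand_index, receptor_index, distance) for pairs within cutoff."""
    contacts = []
    for i, lig_xyz in enumerate(lig_coordinates):
        for j, rec_xyz in enumerate(rec_coordinates):
            distance = math.dist(lig_xyz, rec_xyz)  # Euclidean distance
            if distance <= cutoff:
                contacts.append((i, j, round(distance, 3)))
    return contacts


# Two ligand atoms and two receptor atoms, coordinates in angstroms.
lig = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
rec = [(3.0, 0.0, 0.0), (0.0, 4.5, 0.0)]
contacts = atoms_within_cutoff(lig, rec, cutoff=4.0)
```

Only the first ligand atom and the first receptor atom are within 4.0 Å of each other in this toy input.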
ringtail.mpmanager module
- class ringtail.mpmanager.MPManager(docking_mode, max_poses, interaction_tolerance, store_all_poses, add_interactions, interaction_cutoffs, max_proc, storageman, storageman_class, chunk_size, target, receptor_file, file_pattern=None, file_sources=None, string_sources=None)
Bases:
object
Manager that orchestrates parallel processing of docking results data, using one of the supported multiprocessors.
- docking_mode
describes what docking engine was used to produce the results
- Type:
str
- max_poses
max number of poses to store for each ligand
- Type:
int
- interaction_tolerance
Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.
- Type:
float
- store_all_poses
Store all poses from docking results
- Type:
bool
- add_interactions
find and save interactions between ligand poses and receptor
- Type:
bool
- interaction_cutoffs
cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms
- Type:
list(float)
- max_proc
Maximum number of processes to create during parallel file parsing.
- Type:
int
- storageman
storageman object
- Type:
- storageman_class
storagemanager child class/database type
- Type:
- chunk_size
how many tasks to send to a processor at a time
- Type:
int
- target
name of receptor
- Type:
str
- receptor_file
file path to receptor
- Type:
str
- file_pattern
file pattern to look for if recursively finding results files to process
- Type:
str, optional
- file_sources
RingtailOption object that holds all attributes related to results files
- Type:
InputFiles, optional
- string_sources
RingtailOption object that holds all attributes related to results strings
- Type:
InputStrings, optional
- num_files
number of files processed at any given time
- Type:
int
- process_results()
Processes results data (files or string sources) by adding them to the queue and starting their processing in multiprocess.
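The `chunk_size` attribute controls how many tasks are handed to a processor at a time. One way to picture such batching is the generator below; `chunked` and the file names are hypothetical, and how `MPManager` actually splits its work is not stated in this reference.

```python
from itertools import islice


def chunked(items, chunk_size):
    """Yield successive lists of at most chunk_size items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


files = ["lig1.dlg.gz", "lig2.dlg.gz", "lig3.dlg.gz", "lig4.dlg.gz", "lig5.dlg.gz"]
batches = list(chunked(files, 2))
```

With five files and a chunk size of 2, the last batch holds the single leftover file.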
ringtail.mpreaderwriter module
- class ringtail.mpreaderwriter.DockingFileReader(*args: Any, **kwargs: Any)
Bases:
Process
This class is the individual worker for processing docking results. One instance of this class is instantiated for each available processor.
- queueIn
current queue for the processor/file reader
- Type:
multiprocess.Queue
- queueOut
queue for the processor/file reader after adding or removing an item
- Type:
multiprocess.Queue
- pipe_conn
pipe connection to the reader
- Type:
multiprocess.Pipe
- storageman
storageman object
- Type:
- storageman_class
storagemanager child class/database type
- Type:
- docking_mode
describes what docking engine was used to produce the results
- Type:
str
- max_poses
max number of poses to store for each ligand
- Type:
int
- interaction_tolerance
Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.
- Type:
float
- store_all_poses
Store all poses from docking results
- Type:
bool
- add_interactions
find and save interactions between ligand poses and receptor
- Type:
bool
- interaction_cutoffs
cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms
- Type:
list(float)
- target
receptor name
- Type:
str
- run()
Method overload from parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method which is called by the initialization (see below) with start()
- Raises:
NotImplementedError – if parser for specific docking result type is not implemented
- class ringtail.mpreaderwriter.Writer(*args: Any, **kwargs: Any)
Bases:
Process
This class is a listener that retrieves data from the queue and writes it into the database
- process_data(data_packet)
Breaks up the data in the data_packet to distribute between the different arrays to be inserted in the database.
- Parameters:
data_packet (any) – File packet to be processed
- run()
Method overload from parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method which is called by the initialization (see below) with start()
- Raises:
- write_to_storage()
Inserting data to the database through the designated storagemanager.
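The reader/writer split in this module follows a producer-consumer pattern: readers take tasks from one queue and put parsed packets on another, and a single writer drains that second queue into storage. The sketch below illustrates the pattern with threads and `queue.Queue` from the standard library as stand-ins; the real classes are `multiprocess.Process` subclasses, and every name here (`reader`, `writer`, `run_pipeline`, the packet layout) is hypothetical.

```python
import queue
import threading

SENTINEL = None  # marks "no more work" on either queue


def reader(task_queue, result_queue):
    """Worker: take tasks until a sentinel arrives, emit one packet per task."""
    while True:
        task = task_queue.get()
        if task is SENTINEL:
            result_queue.put(SENTINEL)  # tell the writer this reader is done
            break
        result_queue.put({"ligand": task, "name_length": len(task)})


def writer(result_queue, n_readers, storage):
    """Listener: collect packets until every reader has signalled completion."""
    finished = 0
    while finished < n_readers:
        packet = result_queue.get()
        if packet is SENTINEL:
            finished += 1
        else:
            storage.append(packet)  # stand-in for a database insert


def run_pipeline(tasks, n_readers=2):
    task_queue, result_queue, storage = queue.Queue(), queue.Queue(), []
    workers = [
        threading.Thread(target=reader, args=(task_queue, result_queue))
        for _ in range(n_readers)
    ]
    workers.append(
        threading.Thread(target=writer, args=(result_queue, n_readers, storage))
    )
    for worker in workers:
        worker.start()
    for task in tasks:
        task_queue.put(task)
    for _ in range(n_readers):
        task_queue.put(SENTINEL)  # one sentinel per reader
    for worker in workers:
        worker.join()
    return storage


stored = run_pipeline(["lig_a", "lig_bb", "lig_ccc"])
```

Packets may arrive in any order, so the result is best compared after sorting.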
ringtail.outputmanager module
- class ringtail.outputmanager.OutputManager(log_file=None, export_sdf_path=None)
Bases:
object
Class for creating outputs, can be a context manager to handle log files
- log_file
name for log file
- Type:
str
- export_sdf_path
path for exporting SDF molecule files
- Type:
str
- _log_open
if log file is open or not
- Type:
bool
- close_logfile()
Closes the log file properly and resets file pointer to filename
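The class description mentions that the output manager can act as a context manager for log files. A minimal sketch of that idea, assuming nothing about the real implementation beyond the `log_file` and `_log_open` attributes named above (the `LogFile` class and its `write` method are hypothetical):

```python
import os
import tempfile


class LogFile:
    """Toy stand-in showing open/close handling through a context manager."""

    def __init__(self, log_file):
        self.log_file = log_file
        self._handle = None
        self._log_open = False

    def open_logfile(self):
        self._handle = open(self.log_file, "w")
        self._log_open = True

    def close_logfile(self):
        if self._log_open:
            self._handle.close()
            self._handle = None
            self._log_open = False

    def __enter__(self):
        self.open_logfile()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close_logfile()  # runs even if the body raised
        return False          # do not swallow exceptions

    def write(self, line):
        self._handle.write(line + "\n")


log_path = os.path.join(tempfile.mkdtemp(), "output_log.txt")
with LogFile(log_path) as log:
    log.write("Number passing ligands: 2")
    was_open = log._log_open
```

After the `with` block the file is closed regardless of whether the body completed normally.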
- log_num_passing_ligands(number_passing_ligands: int)
Write the number of ligands which pass given filter to log file
- Parameters:
number_passing_ligands (int) – number of ligands that passed filter
- Raises:
- open_logfile(write_filters_header=True)
Opens log file and creates it if needed
- Parameters:
write_filters_header (bool) – only used because one method does not take the same headers
- Raises:
- plot_all_data(xdata, ydata, num_of_bins: int = 100)
Takes dictionary of binned data where key is the coordinates of the bin and value is the number of points in that bin. Adds to scatter plot colored by value
- Parameters:
xdata (list) – list of x axis data (needs to be same length as ydata)
ydata (list) – list of y axis data (needs to be same length as xdata)
num_of_bins (int) – number of bins to organize data in
- Returns:
matplotlib.pyplot.figure
- Raises:
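The binned data described for `plot_all_data` maps bin coordinates to the number of points in each bin. The sketch below shows one way such a dictionary could be built from two equal-length lists; `bin_points` and the sample values are hypothetical, and the binning rule (equal-width bins over the data range) is an assumption.

```python
from collections import Counter


def bin_points(xdata, ydata, num_of_bins=100):
    """Count points per 2D bin; keys are (x_bin, y_bin) index pairs."""
    if len(xdata) != len(ydata):
        raise ValueError("xdata and ydata must have the same length")
    x_min, x_max = min(xdata), max(xdata)
    y_min, y_max = min(ydata), max(ydata)

    def bin_index(value, low, high):
        if high == low:
            return 0
        index = int((value - low) / (high - low) * num_of_bins)
        return min(index, num_of_bins - 1)  # the maximum goes in the last bin

    return Counter(
        (bin_index(x, x_min, x_max), bin_index(y, y_min, y_max))
        for x, y in zip(xdata, ydata)
    )


energies = [-9.0, -8.9, -5.0, -1.0]
efficiencies = [-0.40, -0.39, -0.20, -0.10]
bins = bin_points(energies, efficiencies, num_of_bins=4)
```

The two nearly identical points share a bin, which is what lets a scatter plot be colored by local density.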
- plot_single_points(x: list, y: list, markersize: int = 20, color='crimson')
Add points to scatter plot with given x and y coordinates and color.
- Parameters:
x (float) – x coordinate
y (float) – y coordinate
color (str, optional) – Color for point. Default ‘crimson’.
- Raises:
- save_scatterplot()
Saves current figure as scatter.png
- Raises:
- scatter_hist(x, y, z, ax_histx, ax_histy)
Makes scatterplot with a histogram on each axis
- Parameters:
x (list) – x coordinates for data
y (list) – y coordinates for data
z (list) – z coordinates for data
ax (matplotlib.axis) – scatterplot axis
ax_histx (matplotlib.axis) – x histogram axis
ax_histy (matplotlib.axis) – y histogram axis
- Raises:
- write_filter_log(lines)
Writes lines from results iterable into log file
- Parameters:
lines (iterable) – Iterable with tuples of data for writing into log
- Raises:
- Returns:
number of ligands passing that are written to log file
- Return type:
int
- write_filters_to_log(filters_dict, included_interactions, additional_info='')
Takes dictionary of filters, formats as string and writes to log file
- Parameters:
filters_dict (dict) – dictionary with filtering options
included_interactions (list) – types of interactions to include in the filtering
additional_info (str) – any additional information to write to top of log file
- Raises:
- write_find_similar_header(query_ligname, cluster_name)
Properly formats header for the log file find_similar_ligands
- write_maxmiss_union_header()
Properly formats header for the log file if using max_miss and enumerate_interaction_combs
- write_out_mol(filename, mol, flexres_mols, properties)
Writes out given mol as sdf. Will create the specified sdf folder in current working directory if needed.
- Parameters:
filename (str) – name of SDF file that will be written to
mol (RDKit.Chem.Mol) – RDKit molobject to be written to SDF
flexres_mols (list) – dictionary of rdkit molecules for flexible residues
properties (dict) – dictionary of list of properties to add to mol before writing
- Raises:
- write_receptor_pdbqt(recname: str, receptor_compbytes)
Writes a pdbqt file from receptor “blob”
- Parameters:
recname (str) – name of receptor to use in output filename
receptor_compbytes (blob) – receptor blob
- write_results_bookmark_to_log(bookmark_name)
Write the name of the result bookmark into log
- Parameters:
bookmark_name (str) – name of current results’ bookmark in db
- Raises:
ringtail.parsers module
- ringtail.parsers.parse_single_dlg(fname)
Parse an ADGPU DLG file uncompressed or gzipped
- Parameters:
fname (str) – ligand docking result file name
- Raises:
ValueError –
- Returns:
parsed results ready to be inserted in database
- Return type:
dict
- ringtail.parsers.parse_vina_result(data_pointer) dict
Parser for vina docking results, supporting either pdbqt or gzipped (.gz) files, or with the docking results provided as a string.
- Parameters:
data_pointer (any) – either filename or dictionary of string docking results
- Returns:
parsed results ready to be inserted in database
- Return type:
dict
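Both parsers accept plain or gzipped files. A common way to support that is to pick the opener from the file extension, as sketched below. This is an illustration only: `read_text_maybe_gzipped` is hypothetical, and whether Ringtail detects compression by extension is an assumption.

```python
import gzip
import os
import tempfile


def read_text_maybe_gzipped(fname):
    """Read a text file, transparently handling a .gz extension."""
    opener = gzip.open if fname.endswith(".gz") else open
    with opener(fname, "rt") as handle:
        return handle.read()


# Write the same hypothetical content once plain and once gzipped.
workdir = tempfile.mkdtemp()
plain_path = os.path.join(workdir, "ligand.pdbqt")
gz_path = plain_path + ".gz"
content = "REMARK hypothetical docking result\n"
with open(plain_path, "w") as handle:
    handle.write(content)
with gzip.open(gz_path, "wt") as handle:
    handle.write(content)

plain_text = read_text_maybe_gzipped(plain_path)
gz_text = read_text_maybe_gzipped(gz_path)
```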
- ringtail.parsers.receptor_pdbqt_parser(fname)
Parse receptor PDBQT file into a list of dictionaries, each containing data for a single atom line
- Parameters:
fname (string) – name of receptor pdbqt file to parse
ringtail.receptormanager module
- class ringtail.receptormanager.ReceptorManager
Bases:
object
Class with methods dealing with formatting of receptor information
- static blob2str(receptor_blob)
Converts a blob of compressed receptor file info into a string
- Parameters:
receptor_blob (blob) – zipped receptor blob
- Returns:
receptor string
- Return type:
str
- static make_receptor_blobs(file_list)
Creates compressed receptor info
- Parameters:
file_list (str) – path to receptor file
- Returns:
compressed receptor
- Return type:
blob
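The two static methods are inverses: one compresses receptor text into a blob, the other recovers the text. A round trip can be sketched with the standard `gzip` module. The choice of gzip, the UTF-8 encoding, and the function names are assumptions made for illustration; this reference only says the blob is compressed.

```python
import gzip


def make_blob(receptor_text):
    """Compress receptor text into bytes suitable for a BLOB column."""
    return gzip.compress(receptor_text.encode("utf-8"))


def blob_to_str(receptor_blob):
    """Recover the receptor text from a compressed blob."""
    return gzip.decompress(receptor_blob).decode("utf-8")


# Hypothetical receptor content, not a complete PDBQT record.
receptor_text = "REMARK hypothetical receptor\nATOM      1  N   VAL A 279\n"
blob = make_blob(receptor_text)
restored = blob_to_str(blob)
```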
ringtail.resultsmanager module
- class ringtail.resultsmanager.ResultsManager(docking_mode: str = None, max_poses: int = None, interaction_tolerance: float = None, store_all_poses: bool = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, storageman: StorageManager = None, storageman_class: StorageManager = None, chunk_size: int = 1, parser_manager: str = 'multiprocess', file_sources=None, string_sources=None)
Bases:
object
Class that handles the processing of the results, including passing on the docking results to the appropriate parallel/multi-processing unit
- Parameters:
max_poses (int) – max number of poses to store for each ligand
interaction_tolerance (float) – Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.
store_all_poses (bool) – Store all poses from docking results
add_interactions (bool) – find and save interactions between ligand poses and receptor
interaction_cutoffs (list(float)) – cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms
max_proc (int) – Maximum number of processes to create during parallel file parsing.
storageman (StorageManager) – storageman object
storageman_class (StorageManager) – storagemanager child class/database type
chunk_size (int) – how many tasks to send to a processor at a time
parser_manager (str, optional) – what parallelization or multiprocessing package to use
file_sources (InputFiles, optional) – given file sources including the receptor file
string_sources (InputStrings, optional) – given string sources including the path to the receptor
- Raises:
- process_docking_data()
Processes docking data in the form of files or strings
- Raises:
ResultsProcessingError – if no file or string sources are provided, or if both are provided
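The error condition above is an "exactly one of" check on the two kinds of sources. A minimal sketch of such a check, using a locally defined stand-in exception and a hypothetical `choose_source` helper:

```python
class ResultsProcessingError(Exception):
    """Local stand-in for ringtail.exceptions.ResultsProcessingError."""


def choose_source(file_sources=None, string_sources=None):
    """Return which source to process, requiring exactly one to be given."""
    if file_sources and string_sources:
        raise ResultsProcessingError(
            "Both file and string sources were provided; give only one."
        )
    if not file_sources and not string_sources:
        raise ResultsProcessingError("No file or string sources were provided.")
    return "files" if file_sources else "strings"


selected = choose_source(file_sources=["lig1.dlg"])
```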
ringtail.ringtailcore module
- class ringtail.ringtailcore.RingtailCore(db_file: str = 'output.db', storage_type: str = 'sqlite', docking_mode: str = 'dlg', logging_level: str = 'WARNING')
Bases:
object
Core class for coordinating different actions on virtual screening including adding results to storage, filtering and clustering, and outputting data as rdkit molecules, plotting docking results, and visualizing select ligands in pymol.
- db_file
name of database file being operated on
- Type:
str
- docking_mode
specifies what docking mode has been used for the results in the database
- Type:
str
- storageman
Interface module with database
- Type:
- resultsman
Module to deal with results processing before adding to database
- Type:
- outputman
Manager for output tasks of log writing, plotting, ligand SDF writing, starting pymol sessions
- Type:
- _run_mode
refers to whether ringtail is run from the command line or through direct API use, where the former is more restrictive
- Type:
str
- add_results_from_files(file: str = None, file_path: str = None, file_list: str = None, file_pattern: str = None, recursive: bool = None, receptor_file: str = None, save_receptor: bool = None, filesources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)
Call storage manager to process result files and add to database. Creates a database or adds to an existing one. Options can be provided as a dict or as individual options. If both are provided, individual options will overwrite those from the dictionary.
- Parameters:
file (str or list(str), optional) – ligand result file
file_path (str or list(str), optional) – list of folders containing one or more result files
file_list (str or list(str), optional) – list of ligand result file(s)
file_pattern (str) – file pattern to use with recursive search in a file_path, “.dlg” for AutoDock-GPU and “.pdbqt” for vina
recursive (bool) – used to recursively search file_path for folders inside folders
receptor_file (str) – string containing the receptor .pdbqt
save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)
filesources_dict (dict) – file sources already as an object
duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
store_all_poses (bool) – store all ligand poses, does it take precedence over max poses?
max_poses (int) – how many poses to save (ordered by some score?)
add_interactions (bool) – add ligand-receptor interaction data, only in vina mode
interaction_tolerance (float) – longest ångström distance that is considered interaction?
interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction
max_proc (int) – max number of computer processors to use for file reading
options_dict (dict) – write options as a dict
- Raises:
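The rule that individual options overwrite values from the dictionary can be pictured as a simple merge. In the sketch below, `merge_options` is hypothetical, and treating `None` as "not provided" is an assumption suggested by the `None` defaults in the signature.

```python
def merge_options(options_dict=None, **individual):
    """Start from the dict, then let explicitly given options win."""
    merged = dict(options_dict or {})
    merged.update(
        {name: value for name, value in individual.items() if value is not None}
    )
    return merged


merged = merge_options(
    {"max_poses": 3, "store_all_poses": False},
    max_poses=5,           # explicit value overrides the dict
    store_all_poses=None,  # not provided, so the dict value is kept
)
```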
- add_results_from_vina_string(results_strings: dict = None, receptor_file: str = None, save_receptor: bool = None, resultsources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)
Call storage manager to process the given vina output string and add to database. Options can be provided as a dict or as individual options. Creates a database or adds to an existing one.
- Parameters:
results_strings (dict) – dictionary containing the ligand identifier and docking results string
receptor_file (str) – string containing the receptor .pdbqt
save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)
resultsources_dict (dict) – file sources already as an object
duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
store_all_poses (bool) – store all ligand poses, does it take precedence over max poses?
max_poses (int) – how many poses to save (ordered by some score?)
add_interactions (bool) – add ligand-receptor interaction data, only in vina mode
interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction
max_proc (int) – max number of computer processors to use for file reading
options_dict (dict) – write options as a dict
- Raises:
- static default_dict() dict
Creates a dict of all Ringtail options.
- Returns:
json string with options
- Return type:
str
- display_pymol(bookmark_name=None)
Launch pymol session and plot of LE vs docking score. Displays molecules when clicked.
- Parameters:
bookmark_name (str) – bookmark name to use in pymol. ‘None’ uses the whole db?
- property docking_mode
Private method to retrieve docking mode
- Returns:
docking mode
- Return type:
str
- drop_bookmark(bookmark_name: str)
Drops specified bookmark from the database
- Parameters:
bookmark_name (str) – name of bookmark to be dropped.
- export_bookmark_db(bookmark_name: str = None) str
Export database containing data from bookmark
- Parameters:
bookmark_name (str) – name for bookmark_db
- Returns:
name of the new, exported database
- Return type:
str
- export_csv(requested_data: str, csv_name: str, table=False)
Get requested data from database, export as CSV
- Parameters:
requested_data (str) – Table name or SQL-formatted query
csv_name (str) – Name for exported CSV file
table (bool) – flag indicating if requested data is a table name
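The `table` flag distinguishes a table name from a full SQL query. A self-contained sketch of that idea with `sqlite3` and `csv` from the standard library follows; `export_query_to_csv`, the table name, and the columns are hypothetical, and the real method writes to a file named by `csv_name` rather than to a text buffer.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, requested_data, out, table=False):
    """Write a table or the result of an SQL query to a CSV stream."""
    if table:
        # Table names cannot be bound as SQL parameters, so this sketch
        # only accepts plain identifiers before quoting them.
        if not requested_data.isidentifier():
            raise ValueError(f"invalid table name: {requested_data!r}")
        query = f'SELECT * FROM "{requested_data}"'
    else:
        query = requested_data
    cursor = conn.execute(query)
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])  # header
    writer.writerows(cursor)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL)")
conn.executemany(
    "INSERT INTO Results VALUES (?, ?)", [("lig1", -9.1), ("lig2", -7.4)]
)

whole_table = io.StringIO()
export_query_to_csv(conn, "Results", whole_table, table=True)

best_only = io.StringIO()
export_query_to_csv(
    conn, "SELECT LigName FROM Results WHERE docking_score < -8", best_only
)
```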
- export_receptors()
Export receptor in database to pdbqt
- filter(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, filters_dict: dict | None = None, enumerate_interaction_combs: bool = False, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, log_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, bookmark_name: str = None, filter_bookmark: str = None, options_dict: dict | None = None, return_iter=False)
Prepare list of filters, then hand it off to storageman to perform filtering. Creates log of all ligand docking results that passes.
- Parameters:
Filters:
eworst (float) – specify the worst energy value accepted
ebest (float) – specify the best energy value accepted
leworst (float) – specify the worst ligand efficiency value accepted
lebest (float) – specify the best ligand efficiency value accepted
score_percentile (float) – specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.
le_percentile (float) – specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.
vdw_interactions (list[tuple]) – define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_interactions (list[tuple]) – define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
reactive_interactions (list[tuple]) – check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_count (list[tuple]) – accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]
react_any (bool) – check if ligand reacted with any residue
max_miss (int) – Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.
ligand_name (list[str]) – specify ligand name(s). Will combine name filters with OR, e.g., [[“lig1”, “lig2”]]
ligand_substruct (list[str]) – SMARTS pattern(s) for substructure matching, e.g., [[“ccc”, “CN”]]
ligand_substruct_pos (list[list[type]]) – SMARTS, index of atom in SMARTS, cutoff dist, and target XYZ coords, e.g., [[“[Oh]C”, 0, 1.2, -5.5, 10.0, 15.5]] -> [[“smart_string”, index_of_positioned_atom, cutoff_distance, x, y, z]]
ligand_max_atoms (int) – Maximum number of heavy atoms a ligand may have
ligand_operator (str) – logical join operator for multiple SMARTS (default: OR), either AND or OR
filters_dict (dict) – provide filters as a dictionary
Ligand results options:
enumerate_interaction_combs (bool) – When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.
output_all_poses (bool) – By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.
mfpt_cluster (float) – Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.
interaction_cluster (float) – Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.
log_file (str) – by default, results are saved in output_log.txt; if this option is used, ligands and requested info passing the filters will be written to specified file
overwrite (bool) – by default, if a log file exists, it doesn’t get overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database.
order_results (str) – Stipulates how to order the results when written to the log file. By default, results are ordered by the order in which they were added to the database. ONLY TAKES ONE OPTION. Available fields are: “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds)
outfields (str) – defines which fields are used when reporting the results (to stdout and to the log file); fields are specified as comma-separated values, e.g. --outfields=e,le,hb; by default, docking_score (energy) and ligand name are reported; ligand is always reported in the first column. Available fields are: “Ligand_name” (Ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name)
bookmark_name (str) – name for resulting bookmark. Default value is ‘passing_results’
filter_bookmark (str) – name of bookmark to perform filtering over
options_dict (dict) – write options as a dict
return_iter (bool) – return an iterable of all of the filtering results
- Returns:
number of ligands passing filter; iter (optional): an iterable of all of the filtering results
- Return type:
int
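The `max_miss` option refers to all combinations of interaction filters in which up to `max_miss` filters are left out. The enumeration itself can be sketched with `itertools.combinations`; this is only an illustration of the counting, not the query logic Ringtail builds, and `interaction_combinations` and the third residue are hypothetical.

```python
from itertools import combinations


def interaction_combinations(interactions, max_miss):
    """List every subset that omits at most max_miss of the interactions."""
    n = len(interactions)
    combos = []
    for n_missing in range(min(max_miss, n) + 1):
        combos.extend(combinations(interactions, n - n_missing))
    return combos


filters = ["A:VAL:279:", "A:LYS:162:", "A:ASP:86:"]
combos = interaction_combinations(filters, max_miss=1)
```

With three interaction filters and `max_miss=1` there are four combinations: the full set plus the three ways of dropping one filter. The count grows quickly with more filters, which is consistent with the runtime warning on `enumerate_interaction_combs`.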
- finalize_write()
Finalize database write by creating interaction tables and setting database version
- find_similar_ligands(query_ligname: str)
Find ligands in cluster with query_ligname
- Parameters:
query_ligname (str) – name of the ligand in the ligand table to look for similars to
- Returns:
number of ligands that are similar
- Return type:
int
- static generate_config_file_template()
Outputs to “config.json” in the current working directory if to_file = true; otherwise it returns the dict of default option values used for the API (for the command line, a few more options are included that are always used explicitly when using the API)
- Parameters:
to_file (bool) – whether to produce the template as a json string or as a file “config.json”
- Returns:
file name of config file or json string with template including default values
- Return type:
str
- get_bookmark_names()
Method to retrieve all bookmark names in a database
- Returns:
names of all bookmarks in a database
- Return type:
list
- static get_options_info() dict
Gets names, default values, and meta data for all Ringtail options.
- get_plot_data(bookmark_name: str = None)
Get ligand efficiency and energy for all docking data and for ligands that passed filtering in specified bookmark. Each tuple in the respective lists contains docking_score, leff, pose_id, and ligand name.
- Parameters:
bookmark_name (str)
- Returns:
[all_data], [filtered_data]
- Return type:
list(tuple), list(tuple)
- get_previous_filter_data(outfields=None, bookmark_name=None, log_file=None)
Get data requested in self.out_opts[‘outfields’] from the results bookmark of a previous filtering
- Parameters:
outfields (str) – use outfields as described in RingtailOptions > StorageOptions
bookmark_name (str) – bookmark for which the filters were used
- ligands_rdkit_mol(bookmark_name=None, write_nonpassing=False) dict
Creates a dictionary of RDKit mols of all ligands specified from a bookmark, either excluding (default) or including those ligands that did not pass the filter(s).
- Parameters:
bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering
write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands
- Returns:
containing ligand names, RDKit mols, flexible residue mols, and other ligand properties
- Return type:
all_mols (dict)
- plot(save=True, bookmark_name: str = None, return_fig_handle: bool = False)
Get data needed for creating Ligand Efficiency vs Energy scatter plot from storageManager. Call OutputManager to create plot.
- Parameters:
save (bool) – whether to save plot to the current directory
bookmark_name (str) – bookmark from which to fetch filtered data to plot
return_fig_handle (bool) – use to return a handle to the matplotlib figure instead of saving or showing figure
- Returns:
will not show figure if returning figure handle
- Return type:
matplotlib.pyplot.figure (optional)
- produce_summary(columns=['docking_score', 'leff'], percentiles=[1, 10]) None
Print summary of data in storage to stdout
- Parameters:
columns (list(str)) – data columns used to prepare summary
percentiles (list(int)) – cutoff percentiles for the summary
- save_receptor(receptor_file)
Add receptor to database.
- Parameters:
receptor_file (str) – path to receptor file
- set_filters(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, dict: dict = None)
Create a filter object containing all numerical and string filters.
- Parameters:
eworst (float) – specify the worst energy value accepted
ebest (float) – specify the best energy value accepted
leworst (float) – specify the worst ligand efficiency value accepted
lebest (float) – specify the best ligand efficiency value accepted
score_percentile (float) – specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.
le_percentile (float) – specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.
vdw_interactions (list[tuple]) – define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_interactions (list[tuple]) – define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
reactive_interactions (list[tuple]) – check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_count (list[tuple]) – accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]
react_any (bool) – check if ligand reacted with any residue
max_miss (int) – Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from the given set. Default will only return the union of poses passing the interaction filter combinations. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.
ligand_name (list[str]) – specify ligand name(s). Will combine name filters with OR, e.g., [“lig1”, “lig2”]
ligand_substruct (list[str]) – SMARTS pattern(s) for substructure matching, e.g., [“ccc”, “CN”]
ligand_substruct_pos (list[str]) – SMARTS pattern, index of the positioned atom in the SMARTS, cutoff distance, and target XYZ coordinates, e.g., [‘”[Oh]C” 0 1.2 -5.5 10.0 15.5’] -> [“smart_string index_of_positioned_atom cutoff_distance x y z”]
ligand_max_atoms (int) – Maximum number of heavy atoms a ligand may have
ligand_operator (str) – logical join operator for multiple SMARTS (default: OR), either AND or OR
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
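The interaction filters above take (specification, wanted) tuples. The sketch below shows one way to split a 'chain:resname:resid:atomname' string into its fields. The helper `parse_interaction` is hypothetical and not part of Ringtail, and treating an empty field as a wildcard is an assumption based on the trailing colon in the examples.

```python
def parse_interaction(spec):
    """Split 'CHAIN:RES:NUM:ATOM_NAME' into named fields.

    Assumption: an empty field (e.g. the trailing atom name in
    'A:VAL:279:') means "any value" and is returned as None.
    """
    parts = spec.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(parts)}: {spec!r}")
    keys = ("chain", "resname", "resid", "atomname")
    return {key: (value or None) for key, value in zip(keys, parts)}


# Same tuple layout as in the parameter descriptions: (spec, wanted)
vdw_interactions = [("A:VAL:279:", True), ("A:LYS:162:", False)]
parsed = [(parse_interaction(spec), wanted) for spec, wanted in vdw_interactions]
print(parsed)
```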
- set_output_options(log_file: str = None, export_sdf_path: str = None, enumerate_interaction_combs: bool = None, dict: dict = None)
Creates output options object that holds attributes related to reading and outputting results. Will assign log_file name and export_sdf_path to the output_manager object.
- Parameters:
log_file (str) – by default, results are saved in “output_log.txt”; if this option is used, ligands and requested info passing the filters will be written to specified file
export_sdf_path (str) – specify the path where to save poses of ligands passing the filters (SDF format); if the directory does not exist, it will be created; if it already exists, an error is thrown unless –overwrite is used. NOTE: the log file will be automatically saved in this path. Ligands will be stored as SDF files in the order specified.
enumerate_interaction_combs (bool) – When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
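To make the interplay of max_miss and enumerate_interaction_combs concrete, the following sketch enumerates every subset of an interaction filter set obtained by dropping up to max_miss interactions. It uses only `itertools.combinations`. The function name is hypothetical, and how Ringtail builds these combinations internally is an assumption.

```python
from itertools import combinations


def interaction_combinations(interactions, max_miss):
    """Return every subset obtained by dropping 0..max_miss interactions."""
    n = len(interactions)
    combos = []
    for missing in range(max_miss + 1):
        size = n - missing
        if size < 1:
            break  # never produce an empty filter set
        combos.extend(combinations(interactions, size))
    return combos


filters = ["A:VAL:279:", "A:LYS:162:", "A:ASP:86:"]
combos = interaction_combinations(filters, max_miss=1)
for combo in combos:
    print(combo)
```

With three interactions and max_miss=1 this yields the full set plus the three two-member subsets, which suggests why logging each combination separately can increase runtime noticeably as the filter set grows.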
- set_resultsman_attributes(store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, dict: dict = None)
Create results_manager_options object if needed, sets options, and assigns them to the results manager object.
- Parameters:
store_all_poses (bool) – store all poses from input files; overrides max_poses
max_poses (int) – store the top pose for the top n clusters (default 3)
add_interactions (bool) – add ligand-receptor interaction data, only in vina mode
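interaction_tolerance (float) – RMSD tolerance; interactions for poses within this RMSD range of the top pose in a cluster are added to that top pose
interaction_cutoffs (list) – distance cutoffs in ångströms for measuring hydrogen bond and VDW interactions, in that order (default [3.7, 4.0])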
max_proc (int) – max number of computer processors to use for file reading
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
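On the command line the interaction cutoffs are given as a comma-separated string (hydrogen bond cutoff first, then VDW), while the API takes a list. A minimal sketch of that conversion follows. The helper `parse_cutoffs` is hypothetical and only illustrates the expected shape of interaction_cutoffs.

```python
def parse_cutoffs(text):
    """Convert 'HB,VDW' (in ångströms) into [hb_cutoff, vdw_cutoff]."""
    values = [float(v) for v in text.split(",")]
    if len(values) != 2:
        raise ValueError(f"expected two comma-separated cutoffs, got {text!r}")
    hb_cutoff, vdw_cutoff = values
    return [hb_cutoff, vdw_cutoff]


interaction_cutoffs = parse_cutoffs("3.7,4.0")
print(interaction_cutoffs)
```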
- set_storageman_attributes(filter_bookmark: str = None, duplicate_handling: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, output_all_poses: str = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, dict: dict = None)
Create storage_manager_options object if needed, sets options, and assigns them to the storage manager object.
- Parameters:
filter_bookmark (str) – Perform filtering over specified bookmark. (in output group in CLI)
duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
overwrite (bool) – by default, if a log file exists, it doesn’t get overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database
order_results (str) – Stipulates how to order the results when written to the log file. By default, results are ordered as they were added to the database. ONLY TAKES ONE OPTION. Available fields are: “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds).
outfields (str) – defines which fields are used when reporting the results (to stdout and to the log file); fields are specified as comma-separated values, e.g. “–outfields=e,le,hb”; by default, docking_score (energy) and ligand name are reported. Available fields are: “Ligand_name” (Ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name). Fields are printed in the order in which they are provided. Ligand name will always be returned and will be added in first position if not specified.
output_all_poses (bool) – By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.
mfpt_cluster (float) – Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.
interaction_cluster (float) – Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.
bookmark_name (str) – name for resulting bookmark. Default value is “passing_results”
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
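The outfields string is a comma-separated list of short field names. The sketch below shows how such a string could be turned into database column names, with the ligand name added in first position when it is missing. The mapping is a subset of the field_to_column_name dictionary documented for the storage manager. The function `outfields_to_columns` itself is hypothetical.

```python
# Subset of the documented field_to_column_name mapping.
FIELD_TO_COLUMN = {
    "Ligand_name": "LigName",
    "e": "docking_score",
    "le": "leff",
    "hb": "num_hb",
    "rank": "pose_rank",
}


def outfields_to_columns(outfields):
    """Translate 'e,le,hb' into column names, keeping the given order."""
    fields = [f.strip() for f in outfields.split(",") if f.strip()]
    if "Ligand_name" not in fields:
        fields.insert(0, "Ligand_name")  # ligand name is always reported
    unknown = [f for f in fields if f not in FIELD_TO_COLUMN]
    if unknown:
        raise ValueError(f"unknown outfields: {unknown}")
    return [FIELD_TO_COLUMN[f] for f in fields]


columns = outfields_to_columns("e,le,hb")
print(columns)
```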
- update_database_version(consent=False, new_version='2.0.0')
Method to update database version from earlier versions to either 1.1.0 or 2.0.0
- write_flexres_pdb(receptor_polymer, ligname: str, filename: str, bookmark_name: str = None)
Writes a receptor pdb with flexible residues based on the ligand provided
- Parameters:
receptor_polymer (Polymer) – version of receptor produced by meeko
ligname (str) – ligand name for which the receptor flexible residue info should be collected
filename (str) – name of the output pdb, extension is optional, will default to ‘.pdb’
bookmark_name (str, optional) – will use last used bookmark if not specified, will not work in a db without any filtering performed
- write_molecule_sdfs(sdf_path: str | None = None, all_in_one: bool = True, bookmark_name: str = None, write_nonpassing: bool = None)
Have output manager write molecule sdf files for passing results in given results bookmark
- Parameters:
sdf_path (str, optional) – Optional path, existing or to be created in the current directory, where SDF files will be saved
all_in_one (bool, optional) – If True will write all molecules to one SDF (separated by $$$$), if False will write one molecule per SDF
bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering
write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands
- Raises:
StorageError – if bookmark or data not found
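For readers unfamiliar with the SDF layout that all_in_one refers to, this sketch concatenates several molecule records into one string, terminating each record with a `$$$$` line. It is a format illustration only: the record contents are placeholders, the helper is hypothetical, and it does not reflect how the output manager actually writes files.

```python
def join_sdf_records(mol_blocks):
    """Concatenate molecule blocks into one SDF string.

    Each record is terminated by a line containing only '$$$$'.
    """
    records = []
    for block in mol_blocks:
        records.append(block.rstrip("\n") + "\n$$$$\n")
    return "".join(records)


# Placeholder records; real molecule blocks contain atom and bond tables.
blocks = ["ligand_1\n(placeholder molblock)\nM  END", "ligand_2\n(placeholder molblock)\nM  END\n"]
sdf_text = join_sdf_records(blocks)
print(sdf_text)
```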
ringtail.ringtailoptions module
- class ringtail.ringtailoptions.Filters
Bases:
RTOptions
Object that holds all optional filters.
- checks()
Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.
- classmethod get_filter_keys(group) list
Provide keys associated with each of the filter groups.
- Parameters:
group (str) – includes property filters, interaction filters, ligand filters, or all filters
- Returns:
list of filter keywords associated with the specified group(s)
- options = {'ebest': {'default': None, 'description': 'Specify the best energy value accepted.', 'type': <class 'float'>}, 'eworst': {'default': None, 'description': 'Specify the worst energy value accepted.', 'type': <class 'float'>}, 'hb_count': {'default': None, 'description': "Accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [('hb_count', 5)].", 'type': <class 'list'>}, 'hb_interactions': {'default': [], 'description': "Define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'le_percentile': {'default': None, 'description': 'Specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'lebest': {'default': None, 'description': 'Specify the best ligand efficiency value accepted.', 'type': <class 'float'>}, 'leworst': {'default': None, 'description': 'Specify the worst ligand efficiency value accepted.', 'type': <class 'float'>}, 'ligand_max_atoms': {'default': None, 'description': 'Maximum number of heavy atoms a ligand may have.', 'type': <class 'int'>}, 'ligand_name': {'default': None, 'description': "Specify list of ligand name(s). Will combine name filters with 'OR'", 'type': <class 'list'>}, 'ligand_operator': {'default': None, 'description': "Logical join operator for multiple substruct filters. Will apply within 'ligand_substruct' filters and within 'ligand_substruct_pos' filters (the two groups are always joined by 'AND').", 'type': <class 'str'>}, 'ligand_substruct': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator. 
If error delimit each substructure with ''.", 'type': <class 'list'>}, 'ligand_substruct_pos': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. For API use list with six elements ['[Oh]C', 0, 1.2, -5.5, 10.0, 15.5] -> ['smart_string', index_of_positioned_atom, cutoff_distance, x, y, z]. For the CLI use as a string without comma separators, separating each filter with commas -> '[Oh]C 0 1.2 -5.5 10.0 15.5'. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator", 'type': <class 'list'>}, 'max_miss': {'default': 0, 'description': "Will compute all possible combinations of interaction filters excluding up to 'max_miss' number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with 'enumerate_interaction_combs' for enumeration of poses passing each individual combination of interaction filters.", 'type': <class 'int'>}, 'react_any': {'default': None, 'description': 'Check if ligand reacted with any residue.', 'type': <class 'bool'>}, 'reactive_interactions': {'default': [], 'description': "Check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'score_percentile': {'default': None, 'description': 'Specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'vdw_interactions': {'default': [], 'description': "Define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}}
- class ringtail.ringtailoptions.GeneralOptions
Bases:
RTOptions
Object that holds choices and default values for miscellaneous arguments used for the command line interface only.
- checks()
Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.
- options = {'db_file': {'default': 'output.db', 'description': 'DB file for which to use for all Ringtail activities.', 'type': <class 'str'>}, 'debug': {'default': None, 'description': 'Print additional error information to STDOUT and to log.', 'type': <class 'bool'>}, 'docking_mode': {'default': 'dlg', 'description': "specify AutoDock program used to generate results. Available options are 'DLG' and 'vina'. Will automatically change --file_pattern to *.dlg* for DLG and *.pdbqt* for vina.", 'type': <class 'str'>}, 'print_summary': {'default': None, 'description': 'prints summary information about stored data to STDOUT.', 'type': <class 'bool'>}, 'verbose': {'default': None, 'description': 'Print results passing filtering criteria to STDOUT and to log. NOTE: runtime may be slower option used.', 'type': <class 'bool'>}}
- class ringtail.ringtailoptions.InputFiles
Bases:
RTOptions
Class that handles sources of data to be written including ligand data paths and how to traverse them, and options to store receptor.
- checks()
Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.
- options = {'file': {'default': None, 'description': 'Ligand docking output file to save. Compressed (.gz) files allowed. Only results files associated the same receptor allowed.', 'type': <class 'list'>}, 'file_list': {'default': None, 'description': 'Text file(s) containing the list of docking output files to save; relative or absolute paths are allowed. Compressed (.gz) files allowed.', 'type': <class 'list'>}, 'file_path': {'default': None, 'description': 'Directory(s) containing docking output files to save. Compressed (.gz) files allowed', 'type': <class 'list'>}, 'file_pattern': {'default': None, 'description': "Specify which pattern to use when searching for result files to process (only with 'file_path').", 'type': <class 'str'>}, 'receptor_file': {'default': None, 'description': 'Use with Vina mode. Give file for receptor PDBQT.', 'type': <class 'str'>}, 'recursive': {'default': None, 'description': "Enable recursive directory scan when 'file_path' is used.", 'type': <class 'bool'>}, 'save_receptor': {'default': None, 'description': "Saves receptor PDBQT to database. Receptor location must be specied with in 'receptor_file'.", 'type': <class 'bool'>}, 'target': {'default': None, 'description': "Name of receptor. This field is autopopulated if 'receptor_file' is supplied.", 'type': <class 'str'>}}
- class ringtail.ringtailoptions.InputStrings
Bases:
RTOptions
Class that handles docking results strings from vina docking, with options to store receptor. Takes docking results string as a dictionary of: {ligand_name: docking_result}
- checks()
Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.
- options = {'receptor_file': {'default': None, 'description': 'Use with Vina mode. Give file for receptor PDBQT.', 'type': <class 'str'>}, 'results_strings': {'default': None, 'description': 'A dictionary of ligand names and ligand docking output results. Currently only valid for vina docking', 'type': <class 'dict'>}, 'save_receptor': {'default': None, 'description': "Saves receptor PDBQT to database. Receptor location must be specied with in 'receptor_file'.", 'type': <class 'bool'>}, 'target': {'default': None, 'description': "Name of receptor. This field is autopopulated if 'receptor_file' is supplied.", 'type': <class 'str'>}}
- class ringtail.ringtailoptions.OutputOptions
Bases:
RTOptions
Class that holds options related to reading and output from the database, including format for result export and alternate ways of displaying the data (plotting).
- checks()
Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.
- options = {'enumerate_interaction_combs': {'default': None, 'description': "When used with 'max_miss' > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.", 'type': <class 'bool'>}, 'export_sdf_path': {'default': '', 'description': "Specify the path where to save poses of ligands passing the filters (SDF format); if the directory does not exist, it will be created; if it already exist, it will throw an error, unless the 'overwrite' is used NOTE: the log file will be automatically saved in this path. Ligands will be stored as SDF files in the order specified.", 'type': <class 'str'>}, 'individual_sdf_files': {'default': False, 'description': 'Use if you like to print chosen molecules to individual SDF files, as opposed to one big SDF.', 'type': <class 'bool'>}, 'log_file': {'default': 'output_log.txt', 'description': "By default, read and filtering results are saved in 'output_log.txt'; if this option is used, ligands and requested info passing the filters will be written to specified file.", 'type': <class 'str'>}}
- class ringtail.ringtailoptions.RTOptions
Bases:
object
Holds standard methods for the ringtail option child classes. Options can be added using this format:
options = {
    "": {"default": '', "type": '', "description": ""},
}
- initialize_from_dict(dict: dict, name)
Initializes a child object using the values available in its option dictionary.
- Parameters:
dict (dict) – of attributes to be initialized to the object
name (str) – name of the childclass/object
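The options format and dictionary-based initialization can be pictured with a small stand-alone class. This is a simplified sketch, not the RTOptions implementation: the class `ExampleOptions` and its two entries are made up for illustration, and type checking is left out.

```python
class ExampleOptions:
    """Hypothetical option holder following the documented options layout."""

    options = {
        "log_file": {
            "default": "output_log.txt",
            "type": str,
            "description": "File that results are written to.",
        },
        "max_poses": {
            "default": 3,
            "type": int,
            "description": "Number of poses to keep.",
        },
    }

    def __init__(self):
        # Every key in `options` becomes an attribute set to its default.
        for name, spec in self.options.items():
            setattr(self, name, spec["default"])

    def todict(self):
        """Return the current option values as a plain dict."""
        return {name: getattr(self, name) for name in self.options}


opts = ExampleOptions()
opts.max_poses = 5
print(opts.todict())
```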
- classmethod is_valid_path(path)
Checks if path exists in current directory.
- Parameters:
path (str)
- Returns:
True if the path exists
- Return type:
bool
- todict()
Return class and its attributes as a dict of native types and not as objects (which they are if they are type checked using TypeSafe).
- static valid_bookmark_name(name) bool
Checks that bookmark name adheres to sqlite naming conventions of alphanumerical and limited symbols.
- Parameters:
name (str) – bookmark name
- Returns:
true if bookmark name is valid
- Return type:
bool
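A name check of this kind is commonly written as a regular expression. The sketch below assumes that only letters, digits, and underscores are allowed. The exact set of symbols Ringtail accepts is not stated here, so the pattern is an assumption and the function is a stand-in rather than the library's implementation.

```python
import re

# Assumption: letters, digits, and underscores only.
_BOOKMARK_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def looks_like_valid_bookmark_name(name):
    """Return True if `name` consists solely of the allowed characters."""
    return _BOOKMARK_PATTERN.fullmatch(name) is not None


print(looks_like_valid_bookmark_name("passing_results"))
print(looks_like_valid_bookmark_name("drop table;"))
```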
- class ringtail.ringtailoptions.ReadOptions
Bases:
RTOptions
Object that holds choices and default values for read and export modes, mostly used for the command line interface.
- checks()
Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.
- options = {'data_from_bookmark': {'default': None, 'description': 'Write log of --outfields data for bookmark specified by --bookmark_name. Must use without any filters.', 'type': <class 'bool'>}, 'export_bookmark_csv': {'default': None, 'description': 'Create csv of the bookmark given with bookmark_name. Output as <bookmark_name>.csv. Can also export full database tables.', 'type': <class 'str'>}, 'export_bookmark_db': {'default': None, 'description': 'Export a database containing only the results found in the bookmark specified by --bookmark_name. Will save as <input_db>_<bookmark_name>.db', 'type': <class 'bool'>}, 'export_query_csv': {'default': None, 'description': 'Create csv of the requested SQL query. Output as query.csv. MUST BE PRE-FORMATTED IN SQL SYNTAX e.g. SELECT [columns] FROM [table] WHERE [conditions]', 'type': <class 'str'>}, 'export_receptor': {'default': None, 'description': 'Export stored receptor pdbqt. Will write to current directory.', 'type': <class 'bool'>}, 'find_similar_ligands': {'default': None, 'description': 'Allows user to find similar ligands to given ligand name based on previously performed morgan fingerprint or interaction clustering.', 'type': <class 'str'>}, 'plot': {'default': None, 'description': 'Makes scatterplot of LE vs Best Energy, saves as scatter.png.', 'type': <class 'bool'>}, 'pymol': {'default': None, 'description': 'Lauch PyMOL session and plot of ligand efficiency vs docking score for molecules in bookmark specified with --bookmark_name. Will display molecule in PyMOL when clicked on plot. Will also open receptor if given.', 'type': <class 'bool'>}}
- class ringtail.ringtailoptions.ResultsProcessingOptions
Bases:
RTOptions
Class that holds database write options that affect write time, such as how to break up data files, number of computer processes to use, and how many poses to store.
- checks()
Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.
- options = {'add_interactions': {'default': False, 'description': "Find interactions between ligand poses and receptor and save to database. Requires receptor PDBQT to be given with input files (all modes) and 'receptor_file' to be specified with Vina mode. SIGNIFICANTLY INCREASES DATBASE WRITE TIME.", 'type': <class 'bool'>}, 'interaction_cutoffs': {'default': [3.7, 4.0], 'description': "Use with 'add_interactions', specify distance cutoffs for measuring interactions between ligand and receptor in angstroms. Give as string, separating cutoffs for hydrogen bonds and VDW with comma (in that order). E.g. '-ic 3.7,4.0' will set the cutoff for hydrogen bonds to 3.7 angstroms and for VDW to 4.0. These are the default cutoffs.", 'type': <class 'list'>}, 'interaction_tolerance': {'default': None, 'description': 'Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose. Can use as flag with default tolerance of 0.8 for cmd line tool, or give other value as desired (cmd line and api). Only compatible with ADGPU mode.', 'type': <class 'float'>}, 'max_poses': {'default': 3, 'description': 'Store top pose for top n clusters.', 'type': <class 'int'>}, 'max_proc': {'default': None, 'description': 'Maximum number of processes to create during parallel file parsing. Defaults to number of CPU processors.', 'type': <class 'int'>}, 'store_all_poses': {'default': False, 'description': "Store all poses from input files. Overrides 'max_poses'.", 'type': <class 'bool'>}}
- class ringtail.ringtailoptions.StorageOptions
Bases:
RTOptions
Class that handles options for the storage (database) manager class, including conflict handling, and results clustering and ordering.
- checks()
Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.
- options = {'bookmark_name': {'default': 'passing_results', 'description': "name for resulting book mark file. Default value is 'passing_results'", 'type': <class 'str'>}, 'duplicate_handling': {'default': None, 'description': "Specify how duplicate Results rows should be handled when inserting into database. Options are 'ignore' or 'replace'. Default behavior (no option provided) will allow duplicate entries.", 'type': <class 'str'>}, 'filter_bookmark': {'default': None, 'description': 'Perform filtering over specified bookmark.', 'type': <class 'str'>}, 'interaction_cluster': {'default': None, 'description': 'Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Useful for enhancing selection of ligands with diverse interactions.', 'type': <class 'float'>}, 'mfpt_cluster': {'default': None, 'description': 'Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Useful for selecting chemically dissimilar ligands.', 'type': <class 'float'>}, 'order_results': {'default': None, 'description': "Stipulates how to order the results when written to the log file. By default will be ordered by order results were added to the database. 
ONLY TAKES ONE OPTION.\n available fields are: \n 'e' (docking_score), \n 'le' (ligand efficiency), \n 'delta' (delta energy from best pose), \n 'ref_rmsd' (RMSD to reference pose), \n 'e_inter' (intermolecular energy), \n 'e_vdw' (van der waals energy), \n 'e_elec' (electrostatic energy), \n 'e_intra' (intermolecular energy), \n 'n_interact' (number of interactions), \n 'rank' (rank of ligand pose), \n 'run' (run number for ligand pose), \n 'hb' (hydrogen bonds); ", 'type': <class 'str'>}, 'outfields': {'default': 'Ligand_name,e', 'description': "Defines which fields are used when reporting the results (to stdout and to the log file). Fields are specified as comma-separated values, e.g. 'outfields=e,le,hb'; by default, docking_score (energy) and ligand name are reported. Ligand always reported in first column available fields are: \n\n 'Ligand_name' (Ligand name), \n 'e' (docking_score), \n 'le' (ligand efficiency), \n 'delta' (delta energy from best pose), \n 'ref_rmsd' (RMSD to reference pose), \n 'e_inter' (intermolecular energy), \n 'e_vdw' (van der waals energy), \n 'e_elec' (electrostatic energy), \n 'e_intra' (intermolecular energy), \n 'n_interact' (number of iteractions), \n 'ligand_smile' , \n 'rank' (rank of ligand pose), \n 'run' (run number for ligand pose), \n 'hb' (hydrogen bonds), \n 'receptor' (receptor name);", 'type': <class 'str'>}, 'output_all_poses': {'default': None, 'description': 'By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.', 'type': <class 'bool'>}, 'overwrite': {'default': None, 'description': "This option will allow overwriting of the database (in 'write'/add files-mode) and filtering log_file (in 'read'/filtering mode).", 'type': <class 'bool'>}}
- order_options = {'delta', 'e', 'e_elec', 'e_inter', 'e_intra', 'e_vdw', 'hb', 'le', 'n_interact', 'rank', 'ref_rmsd', 'run'}
- class ringtail.ringtailoptions.TypeSafe(default, type, object_name)
Bases:
object
Class that handles safe typesetting of values of a specified built-in type. Any attribute can be set as a TypeSafe object, this ensures its type is checked whenever it is changed. This makes the attribute of type ‘object’ as opposed to its actual type. To return the value of an attribute as a native type value, you can create a ‘__getattribute__’ method in the class that holds the attribute (see e.g., RTOptions).
The hope is to extend this to work with custom types, such as “percentage” (float with a max and min value) and directory (string that must end with ‘/’).
- Parameters:
object_name (str) – name of type safe instance
type (type) – any of the native types in python that the instance must adhere to
default (any) – default value of the object, can be any including None
value (any) – value of type type assigned to instance, can be same or different than default
- Raises:
OptionError – if wrong type is attempted.
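The idea of a value that re-checks its type on every assignment can be sketched with a property setter. The class below is a simplified stand-in, not the TypeSafe implementation: it raises the built-in TypeError instead of OptionError, and the name `TypeChecked` is hypothetical.

```python
class TypeChecked:
    """Holds a value and verifies its type whenever it is changed."""

    def __init__(self, default, type, object_name):
        self.object_name = object_name
        self.type = type
        self.value = default  # routed through the setter below

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        # None is accepted so that options can stay unset.
        if new_value is not None and not isinstance(new_value, self.type):
            raise TypeError(
                f"{self.object_name} must be of type {self.type.__name__}, "
                f"got {new_value.__class__.__name__}"
            )
        self._value = new_value


max_poses = TypeChecked(default=3, type=int, object_name="max_poses")
max_poses.value = 5  # accepted

try:
    max_poses.value = "five"  # rejected: wrong type
    rejected = False
except TypeError as err:
    rejected = True
    print(err)
```

Because the attribute is then a wrapper object rather than a plain value, the holding class needs some way to hand back the native value, which is the role of the `__getattribute__` approach mentioned above.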
ringtail.storagemanager module
- class ringtail.storagemanager.StorageManager
Bases:
object
- check_passing_bookmark_exists(bookmark_name: str | None = None)
Checks if bookmark name is in database
- Parameters:
bookmark_name (str, optional) – name of the bookmark to check for; if not given, the storageman bookmark_name attribute is used
- Returns:
indicates if bookmark_name exists in the current database
- Return type:
bool
- check_storage_compatibility()
Checks if chosen storage type has been implemented
- Parameters:
storage_type (str) – name of the storage type
- Raises:
NotImplementedError – raised if selected storage type has not been implemented
- Returns:
class of the implemented storage type
- Return type:
class
- close_storage(attached_db=None, vacuum=False)
Close connection to database
- Parameters:
attached_db (str, optional) – name of attached DB (not including file extension)
vacuum (bool, optional) – indicates that database should be vacuumed before closing
- crossref_filter(new_db: str, bookmark1_name: str, bookmark2_name: str, selection_type='-', old_db=None) tuple
Selects ligands found or not found in the given bookmark in both current db and new_db. Stores as temp view
- Parameters:
new_db (str) – file name for database to attach
bookmark1_name (str) – string for name of first bookmark/temp table to compare
bookmark2_name (str) – string for name of second bookmark to compare
selection_type (str) – “+” or “-” indicating if ligand names should (“+”) or should not (“-”) be in both databases
old_db (str, optional) – file name for previous database
- Returns:
(name of new bookmark (str), number of ligands passing new bookmark (int))
- Return type:
tuple
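Conceptually, the two selection types correspond to set intersection and set difference on ligand names. The sketch below shows that logic on plain Python sets. It assumes “-” keeps ligands present in the first bookmark but absent from the second, and it says nothing about the SQL that the storage manager actually runs.

```python
def crossref(ligands_current, ligands_other, selection_type="-"):
    """Combine two sets of ligand names.

    '+' keeps names found in both sets.
    '-' keeps names from the first set that are missing in the second
    (assumed interpretation).
    """
    if selection_type == "+":
        return ligands_current & ligands_other
    if selection_type == "-":
        return ligands_current - ligands_other
    raise ValueError("selection_type must be '+' or '-'")


bookmark1 = {"lig1", "lig2", "lig3"}
bookmark2 = {"lig2", "lig3", "lig4"}
in_both = crossref(bookmark1, bookmark2, "+")
only_first = crossref(bookmark1, bookmark2, "-")
print(sorted(in_both), sorted(only_first))
```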
- field_to_column_name = {'Ligand_name': 'LigName', 'delta': 'deltas', 'e': 'docking_score', 'e_elec': 'energies_electro', 'e_inter': 'energies_inter', 'e_intra': 'energies_intra', 'e_vdw': 'energies_vdw', 'hb': 'num_hb', 'interactions': 'interactions', 'le': 'leff', 'ligand_smile': 'ligand_smile', 'n_interact': 'nr_interactions', 'rank': 'pose_rank', 'receptor': 'receptor', 'ref_rmsd': 'reference_rmsd', 'run': 'run_number'}
- filter_results(all_filters: dict, suppress_output=False) iter
Generate and execute database queries from given filters.
- Parameters:
all_filters (dict) – dict containing all filters. Expects format and keys corresponding to ringtail.Filters().todict()
suppress_output (bool) – if True, suppresses printing of the filtering summary to stdout
- Returns:
iterable, such as an sqlite cursor, of passing results
- Return type:
iter
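To show how a filter dictionary might translate into a database query, the sketch below builds a parameterized SQL statement from eworst and ebest and runs it against an in-memory SQLite table. The table layout is a toy: only the column names LigName, docking_score, and leff are taken from the documented field mapping, and both the helper `build_query` and the schema are assumptions.

```python
import sqlite3


def build_query(filters):
    """Build a parameterized SELECT from energy filters."""
    clauses, params = [], []
    if filters.get("eworst") is not None:
        clauses.append("docking_score <= ?")  # no worse than eworst
        params.append(filters["eworst"])
    if filters.get("ebest") is not None:
        clauses.append("docking_score >= ?")  # no better than ebest
        params.append(filters["ebest"])
    sql = "SELECT LigName, docking_score FROM Results"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY docking_score"
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL, leff REAL)")
conn.executemany(
    "INSERT INTO Results VALUES (?, ?, ?)",
    [("lig1", -9.5, -0.40), ("lig2", -6.1, -0.25), ("lig3", -12.3, -0.35)],
)

sql, params = build_query({"eworst": -7.0, "ebest": -12.0})
passing = conn.execute(sql, params).fetchall()
conn.close()
print(passing)
```

Using `?` placeholders keeps the filter values out of the SQL text itself, which avoids quoting problems when values come from user input.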
- finalize_database_write()
Finalizes a database write and saves the current database schema to the sqlite database.
- get_plot_data(bookmark_name: str = None, only_passing=False)
This function is expected to return an ascii plot representation of the results
- Parameters:
bookmark_name (str) – name of bookmark for which to fetch passing data. Will use default bookmark name if None. Returns empty list if bookmark does not exist.
only_passing (bool) – Only return data for passing ligands. Will return empty list for all data.
- Returns:
cursors as (<all data cursor>, <passing data cursor>)
- Return type:
tuple
- insert_data(results_array, ligands_array, interaction_list, receptor_array=[], insert_receptor=False)
Inserts data from all arrays returned from results manager.
- Parameters:
results_array (list) – list of data to be stored in Results table
ligands_array (list) – list of data to be stored in Ligands table
interaction_list (list) – list of data to be stored in interaction tables
receptor_array (list) – list of data to be stored in Receptors table
insert_receptor (bool, optional) – flag indicating that receptor info should be inserted
- insert_interactions(Pose_IDs: list, interactions_list, duplicates)
Takes list of interactions, inserts into database
- Parameters:
Pose_IDs (list(int)) – list of pose ids assigned while writing the current results to database
interactions_list (list) – List of tuples for interactions in form (“type”, “chain”, “residue”, “resid”, “recname”, “recid”)
duplicates (list(Pose_ID)) – any duplicates identified in “insert_results”, if duplicate handling has been specified
- prune()
Deletes rows from results, ligands, and interactions in a bookmark if they do not pass filtering criteria
- class ringtail.storagemanager.StorageManagerSQLite(db_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, filter_bookmark: str = None, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, duplicate_handling: str = None)
Bases:
StorageManager
SQLite-specific StorageManager subclass
- conn
Connection to database
- Type:
SQLite.conn
- open_cursors
list of cursors that were not closed by the function that created them. Will be closed by close_connection method.
- Type:
list
- db_file
database name
- Type:
str
- overwrite
switch to overwrite database if it exists
- Type:
bool
- order_results
what column name will be used to order results once read
- Type:
str
- outfields
data fields/columns to include when reading and outputting data
- Type:
str
- filter_bookmark
name of bookmark that filtering will be performed over
- Type:
str
- output_all_poses
whether or not to output all poses of a ligand
- Type:
bool
- mfpt_cluster
distance in ångströms to cluster ligands based on morgan fingerprints
- Type:
float
- interaction_cluster
distance in ångströms to cluster ligands based on interactions
- Type:
float
- bookmark_name
name of current bookmark being written to or read from
- Type:
str
- duplicate_handling
optional attribute to deal with insertion of ligands already in the database
- Type:
str
- current_bookmark_name
name of last view to have been written to in the database
- Type:
str
- filtering_window
name of bookmark/view being filtered on
- Type:
str
- index_columns
- Type:
list
- view_suffix
current suffix for views
- Type:
int
- temptable_suffix
current suffix for temporary tables
- Type:
int
- field_to_column_name
Dictionary for converting ringtail options into DB column names
- Type:
dict
- bookmark_has_rows(bookmark_name: str) bool
Method that checks if a given bookmark has any data in it
- Parameters:
bookmark_name (str) – view to check
- Returns:
True if more than zero rows in bookmark
- Return type:
bool
- check_ringtaildb_version()
Checks the database version and confirms whether the code base is compatible with it
- Returns:
whether or not db is compatible with the code base (bool), and the current database version (str)
- Return type:
bool
- check_storage_ready(run_mode: str, docking_mode: str, store_all_poses: bool, max_poses: int)
Check that storage is ready before proceeding, and creates new tables if needed
- Parameters:
run_mode (str) – whether ringtail is run using the command line interface or the API
docking_mode (str) – what docking engine was used to produce results
store_all_poses (bool) – overrides max_poses
max_poses (int) – max poses to save to db
- Raises:
OptionError – if database options are not compatible
- clone(backup_name=None)
Creates a copy of the db
- Parameters:
backup_name (str, optional) – name of the cloned database
- count_receptors_in_db()
returns number of rows in Receptors table where receptor_object already has blob
- Returns:
number of rows in receptors table (int), and the name of the receptor if present in table (str)
- Return type:
int
- Raises:
- create_bookmark(name, query, temp=False, add_poseID=False, filters={})
Takes name and selection query and creates a bookmark of name. Bookmarks are Ringtail-specific views whose information is stored in the ‘Bookmark’ table. Known issue (FIXME): results from ligand-filter-only queries are not added as bookmarks.
- Parameters:
name (str) – Name for bookmark which will be created
query (str) – SQLite-formatted query used to create bookmark
temp (bool, optional) – Flag if bookmark should be temporary
add_poseID (bool, optional) – Add Pose_ID column to bookmark
filters (dict, optional) – a dict of filters used to construct the query
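The idea of a bookmark as a view plus a bookkeeping row can be sketched with the standard `sqlite3` module. The column names of the `Bookmarks` table, the simplified `Results` table, and the helper `create_bookmark_sketch` are hypothetical and only serve to illustrate the pattern; Ringtail's own schema may differ.

```python
import json
import sqlite3


def create_bookmark_sketch(conn, name, query, filters=None):
    """Create a view from `query` and record it in a bookkeeping table."""
    # View names cannot be bound as SQL parameters, so validate before formatting.
    if not name.isidentifier():
        raise ValueError(f"invalid bookmark name: {name!r}")
    conn.execute(f"DROP VIEW IF EXISTS {name}")
    conn.execute(f"CREATE VIEW {name} AS {query}")
    # Store the query and the filters that produced it alongside the view name.
    conn.execute(
        "INSERT OR REPLACE INTO Bookmarks (Bookmark_name, Query, filters) VALUES (?, ?, ?)",
        (name, query, json.dumps(filters or {})),
    )
    conn.commit()


# Hypothetical, simplified tables for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Results (Pose_ID INTEGER PRIMARY KEY, LigName TEXT, docking_score REAL)"
)
conn.execute(
    "CREATE TABLE Bookmarks (Bookmark_name TEXT PRIMARY KEY, Query TEXT, filters TEXT)"
)
conn.executemany(
    "INSERT INTO Results (LigName, docking_score) VALUES (?, ?)",
    [("lig_a", -9.1), ("lig_b", -3.2)],
)

create_bookmark_sketch(
    conn,
    "passing_results",
    "SELECT * FROM Results WHERE docking_score <= -5.0",
    filters={"eworst": -5.0},
)

# Equivalent in spirit to bookmark_has_rows and get_all_bookmark_names.
has_rows = conn.execute("SELECT COUNT(*) FROM passing_results").fetchone()[0] > 0
bookmark_names = [row[0] for row in conn.execute("SELECT Bookmark_name FROM Bookmarks")]
stored_filters = json.loads(
    conn.execute("SELECT filters FROM Bookmarks WHERE Bookmark_name = ?", ("passing_results",)).fetchone()[0]
)
```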
- create_bookmark_from_temp_table(temp_table_name, bookmark_name, original_bookmark_name, wanted_list, unwanted_list=[])
Resaves temp bookmark stored in self.current_bookmark_name as new permanent bookmark
- Parameters:
bookmark_name (str) – name of bookmark to save last temp bookmark as
original_bookmark_name (str) – name of original bookmark
wanted_list (list) – List of wanted database names
unwanted_list (list, optional) – List of unwanted database names
temp_table_name (str) – name of temporary table
- create_temp_table_from_bookmark()
Method that creates a temporary table named “passing_temp”. Please note that this table will be dropped as soon as the database connection closes.
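The lifetime of a temporary table can be demonstrated directly with `sqlite3`. This is a minimal sketch under assumed table and view names (only `passing_temp` comes from the description above); it shows that a `TEMP` table created from a view is visible on the connection that created it and is gone once that connection is closed.

```python
import os
import sqlite3
import tempfile

# A file-backed database so that a second connection can reopen the same data.
db_path = os.path.join(tempfile.mkdtemp(), "sketch.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE Results (Pose_ID INTEGER PRIMARY KEY, docking_score REAL)")
conn.execute("INSERT INTO Results (docking_score) VALUES (-7.5)")
# Hypothetical bookmark view.
conn.execute("CREATE VIEW passing_results AS SELECT * FROM Results WHERE docking_score < -5")
conn.commit()

# Materialize the view as a temporary table tied to this connection.
conn.execute("CREATE TEMP TABLE passing_temp AS SELECT * FROM passing_results")
rows_in_temp = conn.execute("SELECT COUNT(*) FROM passing_temp").fetchone()[0]
conn.close()

# A new connection to the same file no longer sees the temporary table.
conn = sqlite3.connect(db_path)
try:
    conn.execute("SELECT COUNT(*) FROM passing_temp")
    temp_survived = True
except sqlite3.OperationalError:
    temp_survived = False
conn.close()
```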
- drop_bookmark(bookmark_name: str)
Drops specified bookmark from database
- Parameters:
bookmark_name (str) – bookmark to be dropped
- Raises:
- fetch_bookmark(bookmark_name: str) Cursor
returns SQLite cursor of all fields in bookmark
- Parameters:
bookmark_name (str) – name of bookmark to retrieve
- Returns:
cursor of requested view
- Return type:
sqlite3.Cursor
- fetch_clustered_similars(ligname: str)
Given ligname, returns poseids for similar poses/ligands from previous clustering. User prompted at runtime to choose cluster.
- Parameters:
ligname (str) – ligname for ligand to find similarity with
- Raises:
ValueError – wrong terminal input
- fetch_data_for_passing_results() iter
Will return SQLite cursor with requested data for outfields for poses that passed filter in self.bookmark_name
- Returns:
sqlite cursor of data from passing data
- Return type:
iter
- Raises:
- fetch_filters_from_bookmark(bookmark_name: str | None = None)
Method that will retrieve filter values used to construct bookmark
- Parameters:
bookmark_name (str, optional) – can get filter values for given bookmark, or filter values from currently active bookmark in storageman
- Returns:
containing the filter data
- Return type:
dict
- fetch_flexres_info()
fetch flexible residues names and atomname lists
- Returns:
(flexible_residues, flexres_atomnames)
- Return type:
tuple
- fetch_interaction_info_by_index(interaction_idx) tuple
Returns tuple containing interaction info for given interaction_idx
- Parameters:
interaction_idx (int) – interaction index to fetch info for
- Returns:
tuple of info for requested interaction
- Return type:
tuple
- fetch_nonpassing_pose_properties(ligname)
fetch coordinates for poses of ligname which did not pass the filter
- Parameters:
ligname (str) – name of ligand to fetch coordinates for
- Returns:
SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues
- Return type:
iter
- fetch_passing_ligand_output_info() iter
fetch information required by vsmanager for writing out molecules
- Returns:
contains LigName, ligand_smile, atom_index_map, hydrogen_parents
- Return type:
iter
- fetch_passing_pose_properties(ligname)
fetch coordinates for poses passing filter for given ligand
- Parameters:
ligname (str) – name of ligand to fetch coordinates for
- Returns:
SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues
- Return type:
iter
- fetch_pose_interactions(Pose_ID) iter
Fetch all interactions parameters belonging to a Pose_ID
- Parameters:
Pose_ID (int) – pose id, 1-1 with Results table
- Returns:
of interaction information for given Pose_ID
- Return type:
iter
- fetch_receptor_object_by_name(rec_name)
Returns Receptor object from database for given rec_name
- Parameters:
rec_name (str) – Name of receptor to return object for
- Returns:
receptor object as a string
- Return type:
str
- fetch_receptor_objects()
Returns all Receptor objects from database
- Returns:
of receptor names and objects
- Return type:
iter (tuple)
- fetch_single_ligand_output_info(ligname) str
get output information for given ligand
- Parameters:
ligname (str) – ligand name
- Raises:
- Returns:
information containing smiles, atom and index mapping, and hydrogen parents
- Return type:
str
- fetch_single_pose_properties(pose_ID: int) iter
fetch coordinates for pose given by pose_ID
- Parameters:
pose_ID (int) – ID of the pose to fetch coordinates for
- Returns:
SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues
- Return type:
iter
- fetch_summary_data(columns=['docking_score', 'leff'], percentiles=[1, 10]) dict
Collect summary data for database: number of ligands, number of stored poses, number of unique interactions, and the min, max, and percentiles for each column in columns.
- Parameters:
columns (list (str)) – columns to be displayed and used in summary
percentiles (list(int)) – percentiles to consider
- Returns:
of data summary
- Return type:
dict
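A summary of this kind can be approximated as follows. This is a minimal sketch, assuming a simplified `Results` table and a nearest-rank percentile definition; the key names in the returned dict and the helper `summarize` are made up for the example and are not taken from Ringtail.

```python
import math
import sqlite3

ALLOWED_COLUMNS = {"docking_score", "leff"}  # whitelist, since column names are formatted into SQL


def summarize(conn, columns=("docking_score", "leff"), percentiles=(1, 10)):
    """Return counts plus min, max, and nearest-rank percentiles per column."""
    summary = {
        "num_poses": conn.execute("SELECT COUNT(*) FROM Results").fetchone()[0],
        "num_ligands": conn.execute("SELECT COUNT(DISTINCT LigName) FROM Results").fetchone()[0],
    }
    for col in columns:
        if col not in ALLOWED_COLUMNS:
            raise ValueError(f"unsupported column: {col}")
        values = [row[0] for row in conn.execute(f"SELECT {col} FROM Results ORDER BY {col}")]
        if not values:
            continue
        summary[f"min_{col}"] = values[0]
        summary[f"max_{col}"] = values[-1]
        for p in percentiles:
            # Nearest-rank percentile on the ascending values (best scores first).
            rank = max(1, math.ceil(p * len(values) / 100))
            summary[f"{p}%_{col}"] = values[rank - 1]
    return summary


# Hypothetical data: 20 poses spread over 10 ligands.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Results (Pose_ID INTEGER PRIMARY KEY, LigName TEXT, docking_score REAL, leff REAL)"
)
conn.executemany(
    "INSERT INTO Results (LigName, docking_score, leff) VALUES (?, ?, ?)",
    [(f"lig_{i % 10}", float(i - 20), (i - 20) / 20) for i in range(20)],
)
summary = summarize(conn)
```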
- classmethod format_for_storage(ligand_dict: dict) tuple
Takes the file dictionary from the file parser and converts it to the required storage format
- Parameters:
ligand_dict (dict) – Dictionary containing data from the fileparser
- Returns:
of lists ([result_row_1, result_row_2, …], ligand_row, [interaction_tuple_1, interaction_tuple_2, …])
- Return type:
tuple
- get_all_bookmark_names()
Get all bookmarks in sql database as a list of names. Bookmarks are a specific type of sqlite-views whose information is stored in the Bookmarks table.
- Returns:
of bookmark names
- Return type:
list
- get_current_bookmark_name()
returns current bookmark name
- Returns:
name of last passing results bookmark used by database
- Return type:
str
- get_maxmiss_union(total_combinations: int)
Get results that are in union considering max miss
- Parameters:
total_combinations (int) – number of possible combinations
- Returns:
of passing results
- Return type:
iter
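To make the notion of "combinations considering max miss" concrete, the sketch below enumerates every subset of a set of interaction filters obtained by dropping up to `max_miss` of them, then takes the union of poses that satisfy at least one subset. The pose data and the helper name `interaction_combinations` are hypothetical; the real method operates on database tables rather than in-memory dicts.

```python
from itertools import combinations


def interaction_combinations(interactions, max_miss):
    """All subsets obtained by leaving out 0..max_miss interactions."""
    combos = []
    for n_missing in range(max_miss + 1):
        size = len(interactions) - n_missing
        if size <= 0:
            break
        combos.extend(combinations(interactions, size))
    return combos


interaction_filters = ["A:VAL:279:", "A:LYS:162:", "A:ASP:86:"]
combos = interaction_combinations(interaction_filters, max_miss=1)
total_combinations = len(combos)

# Hypothetical mapping of Pose_ID to the interactions found for that pose.
pose_interactions = {
    1: {"A:VAL:279:", "A:LYS:162:", "A:ASP:86:"},
    2: {"A:VAL:279:", "A:ASP:86:"},
    3: {"A:VAL:279:"},
}

# A pose is in the union if it satisfies at least one combination.
union = sorted(
    pose_id
    for pose_id, found in pose_interactions.items()
    if any(set(combo) <= found for combo in combos)
)
```

With three filters and `max_miss=1` there is one full combination and three combinations with a single filter dropped, so pose 3, which misses two interactions, is excluded.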
- insert_receptor_blob(receptor, rec_name)
Takes object of Receptor class, updates the column in Receptor table
- Parameters:
receptor (bytes) – bytes receptor object to be inserted into DB
rec_name (string) – Name of receptor. Used to insert into correct row of DB
- Raises:
DatabaseInsertionError – Description
- overwrite_storage()
Will drop all tables in the database.
- set_bookmark_suffix(suffix)
Sets internal bookmark_suffix variable
- Parameters:
suffix (str) – suffix to attach to bookmark-related queries or creation
- to_dataframe(requested_data: str, table=True) pandas.DataFrame
Returns a pandas DataFrame of the table or query given as requested_data
- Parameters:
requested_data (str) – String containing SQL-formatted query or table name
table (bool) – Flag indicating if requested_data is table name or not
- Returns:
dataframe of requested data
- Return type:
pd.DataFrame
- update_database_version(new_version, consent=False)
Method that updates the SQLite database schema from version 1.0.0 or 1.1.0 to version 1.1.0 or 2.0.0.
Note: If you created a version 1 database with the duplicate handling option, there is a chance of inconsistent behavior of anything involving interactions, as the Pose_ID was not used as an explicit foreign key in db v1.0.0 and v1.1.0.
- Parameters:
consent (bool, optional) – variable to ensure consent to update database is explicit
- Returns:
bool
Module contents
- class ringtail.CLOptionParser
Bases:
object
Command line option/argument parser. Options and switches are utilized in the script ‘rt_process_vs.py’.
- process_mode
operating in ‘write’ or ‘read’ mode
- Type:
str
- rtcore
ringtail core object initialized with the provided db_file
- Type:
- filters
fully parsed and organized optional filters
- Type:
dict
- file_sources
fully parsed docking results and receptor files
- Type:
dict
- writeopts
fully parsed arguments related to database writing
- Type:
dict
- storageopts
fully parsed arguments related to how the storage system behaves
- Type:
dict
- outputopts
fully parsed arguments related to output and reading from the database
- Type:
dict
- print_summary
switch to print database summary
- Type:
bool
- filtering
switch to run filtering method
- Type:
bool
- plot
switch to plot the data
- Type:
bool
- export_bookmark_db
switch to export bookmark as a new database
- Type:
bool
- export_receptor
switch to export receptor information to pdbqt
- Type:
bool
- pymol
switch to visualize ligands in pymol
- Type:
bool
- data_from_bookmark
switch to write bookmark data to the output log file
- Type:
bool
- Raises:
OptionError – Error when an option cannot be parsed correctly
- process_options(parsed_opts)
Process and organize command line options into ringtail options and filter dictionaries and ringtail core attributes
- Parameters:
parsed_opts (argparse.Namespace) – arguments provided through the cmdline_parser method.
- exception ringtail.DatabaseConnectionError
Bases:
StorageError
- exception ringtail.DatabaseInsertionError
Bases:
StorageError
- exception ringtail.DatabaseQueryError
Bases:
StorageError
- exception ringtail.DatabaseTableCreationError
Bases:
StorageError
- exception ringtail.DatabaseViewCreationError
Bases:
StorageError
- class ringtail.DockingFileReader(*args: Any, **kwargs: Any)
Bases:
Process
This class is the individual worker for processing docking results. One instance of this class is instantiated for each available processor.
- queueIn
current queue for the processor/file reader
- Type:
multiprocess.Queue
- queueOut
queue for the processor/file reader after adding or removing an item
- Type:
multiprocess.Queue
- pipe_conn
pipe connection to the reader
- Type:
multiprocess.Pipe
- storageman
storageman object
- Type:
- storageman_class
storagemanager child class/database type
- Type:
- docking_mode
describes what docking engine was used to produce the results
- Type:
str
- max_poses
max number of poses to store for each ligand
- Type:
int
- interaction_tolerance
Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.
- Type:
float
- store_all_poses
Store all poses from docking results
- Type:
bool
- add_interactions
find and save interactions between ligand poses and receptor
- Type:
bool
- interaction_cutoffs
cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms
- Type:
list(float)
- target
receptor name
- Type:
str
- run()
Method override from parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method, which is called during initialization (see below) with start()
- Raises:
NotImplementedError – if parser for specific docking result type is not implemented
- exception ringtail.FileParsingError
Bases:
Exception
- class ringtail.Filters
Bases:
RTOptions
Object that holds all optional filters.
- checks()
Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.
- classmethod get_filter_keys(group) list
Provide keys associated with each of the filter groups.
- Parameters:
group (str) – filter group to look up: property filters, interaction filters, ligand filters, or all filters
- Returns:
list of filter keywords associated with the specified group(s)
- options = {'ebest': {'default': None, 'description': 'Specify the best energy value accepted.', 'type': <class 'float'>}, 'eworst': {'default': None, 'description': 'Specify the worst energy value accepted.', 'type': <class 'float'>}, 'hb_count': {'default': None, 'description': "Accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [('hb_count', 5)].", 'type': <class 'list'>}, 'hb_interactions': {'default': [], 'description': "Define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'le_percentile': {'default': None, 'description': 'Specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'lebest': {'default': None, 'description': 'Specify the best ligand efficiency value accepted.', 'type': <class 'float'>}, 'leworst': {'default': None, 'description': 'Specify the worst ligand efficiency value accepted.', 'type': <class 'float'>}, 'ligand_max_atoms': {'default': None, 'description': 'Maximum number of heavy atoms a ligand may have.', 'type': <class 'int'>}, 'ligand_name': {'default': None, 'description': "Specify list of ligand name(s). Will combine name filters with 'OR'", 'type': <class 'list'>}, 'ligand_operator': {'default': None, 'description': "Logical join operator for multiple substruct filters. Will apply within 'ligand_substruct' filters and within 'ligand_substruct_pos' filters (the two groups are always joined by 'AND').", 'type': <class 'str'>}, 'ligand_substruct': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator. 
If error delimit each substructure with ''.", 'type': <class 'list'>}, 'ligand_substruct_pos': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. For API use list with six elements ['[Oh]C', 0, 1.2, -5.5, 10.0, 15.5] -> ['smart_string', index_of_positioned_atom, cutoff_distance, x, y, z]. For the CLI use as a string without comma separators, separating each filter with commas -> '[Oh]C 0 1.2 -5.5 10.0 15.5'. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator", 'type': <class 'list'>}, 'max_miss': {'default': 0, 'description': "Will compute all possible combinations of interaction filters excluding up to 'max_miss' number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with 'enumerate_interaction_combs' for enumeration of poses passing each individual combination of interaction filters.", 'type': <class 'int'>}, 'react_any': {'default': None, 'description': 'Check if ligand reacted with any residue.', 'type': <class 'bool'>}, 'reactive_interactions': {'default': [], 'description': "Check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'score_percentile': {'default': None, 'description': 'Specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'vdw_interactions': {'default': [], 'description': "Define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}}
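The option descriptions above imply two small conventions that can be sketched in plain Python: the `[-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]` string form of an interaction filter maps to a `(specification, wanted)` tuple, and the sign of `hb_count` switches between "at least" and "no more than". Both helper functions are hypothetical, and reading a leading `-` as "interaction must be absent" is an assumption based on the `<wanted (bool)>` element of the tuple.

```python
def parse_interaction(spec):
    """Convert '[-]CHAIN:RES:NUM:ATOM_NAME' into ('chain:resname:resid:atomname', wanted)."""
    # Assumption: a leading '-' marks an interaction that must NOT be present.
    wanted = not spec.startswith("-")
    body = spec[1:] if not wanted else spec
    if len(body.split(":")) != 4:
        raise ValueError(f"expected CHAIN:RES:NUM:ATOM_NAME, got {spec!r}")
    return body, wanted


def passes_hb_count(n_hbonds, hb_count):
    """Apply the sign convention described for the 'hb_count' filter."""
    if hb_count < 0:
        # Negative value: accept ligands with no more than |hb_count| interactions.
        return n_hbonds <= abs(hb_count)
    # Non-negative value: accept ligands with at least hb_count interactions.
    return n_hbonds >= hb_count


hb_interactions = [parse_interaction(s) for s in ("A:VAL:279:", "-A:LYS:162:")]
```

Empty fields are allowed, as in the `'A:VAL:279:'` example from the option description, where the atom name is left blank.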
- class ringtail.InteractionFinder(rec_string, interaction_cutoff_radii)
Bases:
object
Class for handling and calculating ligand-receptor interactions.
- rec_string
string describing the receptor
- Type:
str
- interaction_cutoff_radii
cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms
- Type:
list(float)
- find_pose_interactions(lig_atomtype_list: list, lig_coordinates: list) dict
Method that identifies interactions for a pose within the given cutoff distances in the main class.
- Parameters:
lig_atomtype_list (list) – list of atoms in the ligand
lig_coordinates (list) – coordinates for the atoms in the ligand
- Returns:
all interaction details for a given ligand pose
- Return type:
dict
- class ringtail.MPManager(docking_mode, max_poses, interaction_tolerance, store_all_poses, add_interactions, interaction_cutoffs, max_proc, storageman, storageman_class, chunk_size, target, receptor_file, file_pattern=None, file_sources=None, string_sources=None)
Bases:
object
Manager that orchestrates parallel processing of docking results data, using one of the supported multiprocessors.
- docking_mode
describes what docking engine was used to produce the results
- Type:
str
- max_poses
max number of poses to store for each ligand
- Type:
int
- interaction_tolerance
Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.
- Type:
float
- store_all_poses
Store all poses from docking results
- Type:
bool
- add_interactions
find and save interactions between ligand poses and receptor
- Type:
bool
- interaction_cutoffs
cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms
- Type:
list(float)
- max_proc
Maximum number of processes to create during parallel file parsing.
- Type:
int
- storageman
storageman object
- Type:
- storageman_class
storagemanager child class/database type
- Type:
- chunk_size
how many tasks to send to a processor at a time
- Type:
int
- target
name of receptor
- Type:
str
- receptor_file
file path to receptor
- Type:
str
- file_pattern
file pattern to look for if recursively finding results files to process
- Type:
str, optional
- file_sources
RingtailOption object that holds all attributes related to results files
- Type:
InputFiles, optional
- string_sources
RingtailOption object that holds all attributes related to results strings
- Type:
InputStrings, optional
- num_files
number of files processed at any given time
- Type:
int
- process_results()
Processes results data (files or string sources) by adding them to the queue and starting their processing in multiprocess.
- exception ringtail.MultiprocessingError
Bases:
Exception
- exception ringtail.OptionError
Bases:
Exception
- exception ringtail.OutputError
Bases:
Exception
- class ringtail.OutputManager(log_file=None, export_sdf_path=None)
Bases:
object
Class for creating outputs, can be a context manager to handle log files
- log_file
name for log file
- Type:
str
- export_sdf_path
path for exporting SDF molecule files
- Type:
str
- _log_open
if log file is open or not
- Type:
bool
- close_logfile()
Closes the log file properly and resets the file pointer to the filename
- log_num_passing_ligands(number_passing_ligands: int)
Write the number of ligands which pass given filter to log file
- Parameters:
number_passing_ligands (int) – number of ligands that passed filter
- Raises:
- open_logfile(write_filters_header=True)
Opens log file and creates it if needed
- Parameters:
write_filters_header (bool) – only used because one method does not take the same headers
- Raises:
- plot_all_data(xdata, ydata, num_of_bins: int = 100)
Takes dictionary of binned data where key is the coordinates of the bin and value is the number of points in that bin. Adds to scatter plot colored by value
- Parameters:
xdata (list) – list of x axis data (needs to be same length as ydata)
ydata (list) – list of y axis data (needs to be same length as xdata)
num_of_bins (int) – number of bins to organize data in
- Returns:
matplotlib.pyplot.figure
- Raises:
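The binned-data dictionary described for plot_all_data can be pictured with a small, dependency-free sketch. Here the keys are bin indices rather than bin coordinates, and the function name `bin_points` is hypothetical; the actual binning used for the scatter plot may differ.

```python
from collections import Counter


def bin_points(xdata, ydata, num_of_bins=100):
    """Count how many (x, y) points fall into each cell of a num_of_bins x num_of_bins grid."""
    if len(xdata) != len(ydata):
        raise ValueError("xdata and ydata must have the same length")
    xmin, xmax = min(xdata), max(xdata)
    ymin, ymax = min(ydata), max(ydata)

    def bin_index(value, low, high):
        if high == low:
            return 0
        # Clamp so that the maximum value falls into the last bin.
        return min(int((value - low) / (high - low) * num_of_bins), num_of_bins - 1)

    return Counter(
        (bin_index(x, xmin, xmax), bin_index(y, ymin, ymax))
        for x, y in zip(xdata, ydata)
    )


# Key: (x bin index, y bin index); value: number of points in that bin.
binned = bin_points([0.0, 0.1, 0.1, 1.0], [0.0, 0.5, 0.5, 1.0], num_of_bins=2)
```

The counts can then be used as the color value for each bin when drawing the scatter plot.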
- plot_single_points(x: list, y: list, markersize: int = 20, color='crimson')
Add points to scatter plot with given x and y coordinates and color.
- Parameters:
x (list) – x coordinates
y (list) – y coordinates
markersize (int, optional) – size of the plotted markers. Default 20.
color (str, optional) – Color for points. Default ‘crimson’.
- Raises:
- save_scatterplot()
Saves current figure as scatter.png
- Raises:
- scatter_hist(x, y, z, ax_histx, ax_histy)
Makes scatterplot with a histogram on each axis
- Parameters:
x (list) – x coordinates for data
y (list) – y coordinates for data
z (list) – z coordinates for data
ax (matplotlib.axis) – scatterplot axis
ax_histx (matplotlib.axis) – x histogram axis
ax_histy (matplotlib.axis) – y histogram axis
- Raises:
- write_filter_log(lines)
Writes lines from results iterable into log file
- Parameters:
lines (iterable) – Iterable with tuples of data for writing into log
- Raises:
- Returns:
number of ligands passing that are written to log file
- Return type:
int
- write_filters_to_log(filters_dict, included_interactions, additional_info='')
Takes dictionary of filters, formats as string and writes to log file
- Parameters:
filters_dict (dict) – dictionary with filtering options
included_interactions (list) – types of interactions to include in the filtering
additional_info (str) – any additional information to write to top of log file
- Raises:
- write_find_similar_header(query_ligname, cluster_name)
Properly formats header for the log file find_similar_ligands
- write_maxmiss_union_header()
Properly formats header for the log file if using max_miss and enumerate_interaction_combs
- write_out_mol(filename, mol, flexres_mols, properties)
Writes out given mol as sdf. Will create the specified sdf folder in current working directory if needed.
- Parameters:
filename (str) – name of SDF file that will be written to
mol (RDKit.Chem.Mol) – RDKit molobject to be written to SDF
flexres_mols (list) – dictionary of rdkit molecules for flexible residues
properties (dict) – dictionary of list of properties to add to mol before writing
- Raises:
- write_receptor_pdbqt(recname: str, receptor_compbytes)
Writes a pdbqt file from receptor “blob”
- Parameters:
recname (str) – name of receptor to use in output filename
receptor_compbytes (blob) – receptor blob
- write_results_bookmark_to_log(bookmark_name)
Write the name of the result bookmark into log
- Parameters:
bookmark_name (str) – name of current results’ bookmark in db
- Raises:
- exception ringtail.RTCoreError
Bases:
Exception
- class ringtail.ReceptorManager
Bases:
object
Class with methods dealing with formatting of receptor information
- static blob2str(receptor_blob)
Converts a compressed receptor blob back into the receptor string
- Parameters:
receptor_blob (blob) – zipped receptor blob
- Returns:
receptor string
- Return type:
str
- static make_receptor_blobs(file_list)
Creates compressed receptor info
- Parameters:
file_list (str) – path to receptor file
- Returns:
compressed receptor
- Return type:
blob
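The round trip between a receptor string and a compressed blob can be sketched as below. The use of gzip and UTF-8 is an assumption for illustration only, since the compression scheme is not stated here, and the function names are hypothetical stand-ins rather than the ReceptorManager methods themselves.

```python
import gzip


def str_to_blob(receptor_string):
    """Compress receptor text into bytes suitable for a BLOB column (gzip assumed)."""
    return gzip.compress(receptor_string.encode("utf-8"))


def blob_to_str(receptor_blob):
    """Inverse of str_to_blob: decompress the bytes back into text."""
    return gzip.decompress(receptor_blob).decode("utf-8")


# Placeholder receptor text; a real receptor would be the contents of a .pdbqt file.
receptor_text = "REMARK hypothetical receptor\nATOM      1  N   VAL A 279\n" * 50
blob = str_to_blob(receptor_text)
restored = blob_to_str(blob)
```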
- class ringtail.ResultsManager(docking_mode: str = None, max_poses: int = None, interaction_tolerance: float = None, store_all_poses: bool = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, storageman: StorageManager = None, storageman_class: StorageManager = None, chunk_size: int = 1, parser_manager: str = 'multiprocess', file_sources=None, string_sources=None)
Bases:
object
Class that handles the processing of the results, including passing on the docking results to the appropriate parallel/multi-processing unit
- Parameters:
max_poses (int) – max number of poses to store for each ligand
interaction_tolerance (float) – Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.
store_all_poses (bool) – Store all poses from docking results
add_interactions (bool) – find and save interactions between ligand poses and receptor
interaction_cutoffs (list(float)) – cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms
max_proc (int) – Maximum number of processes to create during parallel file parsing.
storageman (StorageManager) – storageman object
storageman_class (StorageManager) – storagemanager child class/database type
chunk_size (int) – how many tasks to send to a processor at a time
parser_manager (str, optional) – what parallelization or multiprocessing package to use
file_sources (InputFiles, optional) – given file sources including the receptor file
string_sources (InputStrings, optional) – given string sources including the path to the receptor
- Raises:
- process_docking_data()
Processes docking data in the form of files or strings
- Raises:
ResultsProcessingError – if no file or string sources are provided, or if both are provided
- exception ringtail.ResultsProcessingError
Bases:
Exception
- class ringtail.RingtailCore(db_file: str = 'output.db', storage_type: str = 'sqlite', docking_mode: str = 'dlg', logging_level: str = 'WARNING')
Bases:
object
Core class for coordinating different actions on virtual screening results, including adding results to storage, filtering and clustering, outputting data as rdkit molecules, plotting docking results, and visualizing select ligands in pymol.
- db_file
name of database file being operated on
- Type:
str
- docking_mode
specifies what docking mode has been used for the results in the database
- Type:
str
- storageman
Interface module with database
- Type:
- resultsman
Module to deal with results processing before adding to database
- Type:
- outputman
Manager for output tasks of log-writing, plotting, ligand SDF writing, starting pymol sessions
- Type:
- _run_mode
refers to whether ringtail is run from the command line or through direct API use, where the former is more restrictive
- Type:
str
- add_results_from_files(file: str = None, file_path: str = None, file_list: str = None, file_pattern: str = None, recursive: bool = None, receptor_file: str = None, save_receptor: bool = None, filesources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)
Call storage manager to process result files and add to database. Creates a database or adds to an existing one. Options can be provided as a dict or as individual options. If both are provided, individual options will overwrite those from the dictionary.
- Parameters:
file (str or list(str), optional) – ligand result file
file_path (str or list(str), optional) – list of folders containing one or more result files
file_list (str or list(str), optional) – list of ligand result file(s)
file_pattern (str) – file pattern to use with recursive search in a file_path, “.dlg” for AutoDock-GPU and “.pdbqt” for vina
recursive (bool) – used to recursively search file_path for folders inside folders
receptor_file (str) – string containing the receptor .pdbqt
save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)
filesources_dict (dict) – file sources already as an object
duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
store_all_poses (bool) – store all ligand poses; overrides max_poses
max_poses (int) – max number of poses to store for each ligand
add_interactions (bool) – add ligand-receptor interaction data, only in vina mode
interaction_tolerance (float) – adds the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose
interaction_cutoffs (list) – cutoffs for hydrogen bond and VDW interactions, in ångströms
max_proc (int) – max number of computer processors to use for file reading
options_dict (dict) – write options as a dict
- Raises:
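The precedence rule for add_results_from_files (individual options overwrite values from the options dictionary) can be sketched in a few lines. The helper `resolve_options` and the default values shown are hypothetical; treating `None` as "not provided" mirrors the `None` defaults in the signature but is an assumption about how the merge is done.

```python
def resolve_options(defaults, options_dict=None, **individual_options):
    """Merge options with precedence: defaults < options_dict < individual options."""
    resolved = dict(defaults)
    resolved.update(options_dict or {})
    # None is treated as "not provided", so it never overwrites an earlier value.
    resolved.update({k: v for k, v in individual_options.items() if v is not None})
    unknown = set(resolved) - set(defaults)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    return resolved


# Hypothetical defaults, for illustration only.
defaults = {"max_poses": 3, "store_all_poses": False, "max_proc": None}

resolved = resolve_options(
    defaults,
    options_dict={"max_poses": 5, "store_all_poses": True},
    max_poses=10,          # individual option wins over the dict value
    store_all_poses=None,  # not provided, so the dict value is kept
)
```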
- add_results_from_vina_string(results_strings: dict = None, receptor_file: str = None, save_receptor: bool = None, resultsources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)
Call storage manager to process the given vina output string and add to database. Options can be provided as a dict or as individual options. Creates a database or adds to an existing one.
- Parameters:
results_strings (dict) – dictionary containing the ligand identifier and the docking results string
receptor_file (str) – string containing the receptor .pdbqt
save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)
resultsources_dict (dict) – file sources already as an object
duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
store_all_poses (bool) – store all ligand poses; overrides max_poses
max_poses (int) – how many poses to save (ordered by some score?)
add_interactions (bool) – add ligand-receptor interaction data, only in vina mode
interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction
max_proc (int) – max number of computer processors to use for file reading
options_dict (dict) – write options as a dict
- Raises:
- static default_dict() dict
Creates a dict of all Ringtail options.
- Returns:
json string with options
- Return type:
str
- display_pymol(bookmark_name=None)
Launch pymol session and plot of LE vs docking score. Displays molecules when clicked.
- Parameters:
bookmark_name (str) – bookmark name to use in pymol. ‘None’ uses the whole db?
- property docking_mode
Private method to retrieve docking mode
- Returns:
docking mode
- Return type:
str
- drop_bookmark(bookmark_name: str)
Drops specified bookmark from the database
- Parameters:
bookmark_name (str) – name of bookmark to be dropped.
- export_bookmark_db(bookmark_name: str = None) str
Export database containing data from bookmark
- Parameters:
bookmark_name (str) – name for bookmark_db
- Returns:
name of the new, exported database
- Return type:
str
- export_csv(requested_data: str, csv_name: str, table=False)
Get requested data from database, export as CSV
- Parameters:
requested_data (str) – Table name or SQL-formatted query
csv_name (str) – Name for exported CSV file
table (bool) – flag indicating whether requested data is a table name
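The table-or-query export pattern described for export_csv can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not Ringtail's implementation: the helper name `export_query_to_csv` and the toy `Results` table are hypothetical.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, requested_data, out, table=False):
    """Write a table or the result of an SQL query to a CSV stream.

    Hypothetical helper for illustration only; not Ringtail's code.
    """
    if table:
        # Treat requested_data as a table name and quote it as an identifier.
        query = 'SELECT * FROM "{}"'.format(requested_data.replace('"', '""'))
    else:
        # Otherwise requested_data is already an SQL query.
        query = requested_data
    cursor = conn.execute(query)
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])  # header row
    writer.writerows(cursor)
    return out


# Toy database standing in for a docking results database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL)")
conn.executemany(
    "INSERT INTO Results VALUES (?, ?)", [("lig1", -7.5), ("lig2", -6.1)]
)

table_csv = export_query_to_csv(conn, "Results", io.StringIO(), table=True).getvalue()
query_csv = export_query_to_csv(
    conn, "SELECT LigName FROM Results WHERE docking_score < -7", io.StringIO()
).getvalue()
```

Writing to a stream rather than a file name keeps the sketch easy to test; passing an open file object works the same way.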
- export_receptors()
Export receptor in database to pdbqt
- filter(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, filters_dict: dict | None = None, enumerate_interaction_combs: bool = False, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, log_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, bookmark_name: str = None, filter_bookmark: str = None, options_dict: dict | None = None, return_iter=False)
Prepare list of filters, then hand it off to storageman to perform filtering. Creates log of all ligand docking results that pass.
- Parameters:
Filters –
eworst (float) – specify the worst energy value accepted
ebest (float) – specify the best energy value accepted
leworst (float) – specify the worst ligand efficiency value accepted
lebest (float) – specify the best ligand efficiency value accepted
score_percentile (float) – specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.
le_percentile (float) – specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.
vdw_interactions (list[tuple]) – define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_interactions (list[tuple]) – define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
reactive_interactions (list[tuple]) – check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_count (list[tuple]) – accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]
react_any (bool) – check if ligand reacted with any residue
max_miss (int) – Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from given set. By default, only the union of poses passing the interaction filter combinations is returned. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.
ligand_name (list[str]) – specify ligand name(s). Will combine name filters with OR, e.g., [[“lig1”, “lig2”]]
ligand_substruct (list[str]) – SMARTS pattern(s) for substructure matching, e.g., [[“ccc”, “CN”]]
ligand_substruct_pos (list[list[type]]) – SMARTS, index of atom in SMARTS, cutoff dist, and target XYZ coords, e.g., [[“[Oh]C”, 0, 1.2, -5.5, 10.0, 15.5]] -> [[“smart_string”, index_of_positioned_atom, cutoff_distance, x, y, z]]
ligand_max_atoms (int) – Maximum number of heavy atoms a ligand may have
ligand_operator (str) – logical join operator for multiple SMARTS (default: OR), either AND or OR
filters_dict (dict) – provide filters as a dictionary
Ligand results options –
enumerate_interaction_combs (bool) – When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.
output_all_poses (bool) – By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.
mfpt_cluster (float) – Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.
interaction_cluster (float) – Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.
log_file (str) – by default, results are saved in output_log.txt; if this option is used, ligands and requested info passing the filters will be written to specified file
overwrite (bool) – by default, if a log file exists, it doesn’t get overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database
order_results (str) – Stipulates how to order the results when written to the log file. By default, results are ordered as they were added to the database. ONLY TAKES ONE OPTION. Available fields are: “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds)
outfields (str) – defines which fields are used when reporting the results (to stdout and to the log file); fields are specified as comma-separated values, e.g. --outfields=e,le,hb; by default, docking_score (energy) and ligand name are reported; ligand always reported in first column. Available fields are: “Ligand_name” (Ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name)
bookmark_name (str) – name for resulting bookmark file. Default value is ‘passing_results’
filter_bookmark (str) – name of bookmark to perform filtering over
options_dict (dict) – write options as a dict
return_iter (bool) – return an iterable of all of the filtering results
- Returns:
number of ligands passing filter
iter (optional) – an iterable of all of the filtering results
- Return type:
int
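The max_miss behaviour described above (all combinations of interaction filters that leave out up to max_miss of them) can be sketched with itertools. This is only an illustration of the enumeration, not Ringtail's code: the function name `interaction_combinations` is hypothetical, and the third residue in the example is made up to give the list three entries.

```python
from itertools import combinations


def interaction_combinations(interactions, max_miss):
    """Return every subset of `interactions` that drops at most `max_miss` items.

    Hypothetical helper; the full set comes first, then sets missing one
    interaction, then two, and so on.
    """
    combos = []
    for n_missing in range(max_miss + 1):
        keep = len(interactions) - n_missing
        if keep < 1:
            break  # never produce an empty filter set
        combos.extend(combinations(interactions, keep))
    return combos


# Interaction filters in the (spec, wanted) tuple form used by the API.
filters = [("A:VAL:279:", True), ("A:LYS:162:", True), ("A:ASP:86:", True)]
combos = interaction_combinations(filters, max_miss=1)
```

With three filters and max_miss=1 this yields the full set plus the three pairs. Logging each combination separately, as enumerate_interaction_combs does, multiplies the number of queries accordingly, which is consistent with the runtime warning above.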
- finalize_write()
Finalize database write by creating interaction tables and setting database version
- find_similar_ligands(query_ligname: str)
Find ligands in cluster with query_ligname
- Parameters:
query_ligname (str) – name of the ligand in the ligand table for which to find similar ligands
- Returns:
number of ligands that are similar
- Return type:
int
- static generate_config_file_template()
Outputs to “config.json” in the current working directory if to_file = True; otherwise returns the dict of default option values used for the API (for the command line, a few more options are included that are always set explicitly when using the API)
- Parameters:
to_file (bool) – whether to produce the template as a json string or as a file “config.json”
- Returns:
file name of config file or json string with template including default values
- Return type:
str
- get_bookmark_names()
Method to retrieve all bookmark names in a database
- Returns:
names of all bookmarks in a database
- Return type:
list
- static get_options_info() dict
Gets names, default values, and meta data for all Ringtail options.
- get_plot_data(bookmark_name: str = None)
Get ligand efficiency and energy for all docking data and for ligands that passed filtering in specified bookmark. Each tuple in the respective lists contains docking_score, leff, pose_id, and ligand name.
- Parameters:
bookmark_name (str)
- Returns:
[all_data], [filtered_data]
- Return type:
list(tuple), list(tuple)
- get_previous_filter_data(outfields=None, bookmark_name=None, log_file=None)
Get data requested in self.out_opts[‘outfields’] from the results bookmark of a previous filtering
- Parameters:
outfields (str) – use outfields as described in RingtailOptions > StorageOptions
bookmark_name (str) – bookmark for which the filters were used
- ligands_rdkit_mol(bookmark_name=None, write_nonpassing=False) dict
Creates a dictionary of RDKit mols of all ligands specified from a bookmark, either excluding (default) or including those ligands that did not pass the filter(s).
- Parameters:
bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering
write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands
- Returns:
dictionary containing ligand names, RDKit mols, flexible residue mols, and other ligand properties
- Return type:
all_mols (dict)
- plot(save=True, bookmark_name: str = None, return_fig_handle: bool = False)
Get data needed for creating Ligand Efficiency vs Energy scatter plot from storageManager. Call OutputManager to create plot.
- Parameters:
save (bool) – whether to save plot to the current directory
bookmark_name (str) – bookmark from which to fetch filtered data to plot
return_fig_handle (bool) – use to return a handle to the matplotlib figure instead of saving or showing figure
- Returns:
will not show figure if returning figure handle
- Return type:
matplotlib.pyplot.figure (optional)
- produce_summary(columns=['docking_score', 'leff'], percentiles=[1, 10]) None
Print summary of data in storage to stdout
- Parameters:
columns (list(str)) – data columns used to prepare summary
percentiles (list(int)) – cutoff percentiles for the summary
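As a rough illustration of what a summary over `columns` and `percentiles` involves, the sketch below computes the minimum, maximum, and nearest-rank percentile cutoffs for one column of a toy `Results` table. The helper `column_summary`, the nearest-rank definition, and the table contents are assumptions for the example; how Ringtail computes its percentiles is not stated in this reference.

```python
import math
import sqlite3


def column_summary(conn, column, percentiles=(1, 10)):
    """Return min, max and nearest-rank percentile cutoffs for one column.

    Hypothetical helper; values are sorted ascending, so for docking scores
    the 1% cutoff is the score bounding the best-scoring 1 percent of poses.
    """
    values = sorted(row[0] for row in conn.execute(f'SELECT "{column}" FROM Results'))
    summary = {"min": values[0], "max": values[-1]}
    for p in percentiles:
        # Nearest rank: smallest value with at least p percent of rows at or below it.
        rank = max(1, math.ceil(p * len(values) / 100))
        summary[f"{p}%"] = values[rank - 1]
    return summary


# Toy data: 100 poses with scores -1.0 ... -100.0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (docking_score REAL, leff REAL)")
conn.executemany(
    "INSERT INTO Results VALUES (?, ?)",
    [(-float(i), -i / 100) for i in range(1, 101)],
)
summary = column_summary(conn, "docking_score")
```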
- save_receptor(receptor_file)
Add receptor to database.
- Parameters:
receptor_file (str) – path to receptor file
- set_filters(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, dict: dict = None)
Create a filter object containing all numerical and string filters.
- Parameters:
eworst (float) – specify the worst energy value accepted
ebest (float) – specify the best energy value accepted
leworst (float) – specify the worst ligand efficiency value accepted
lebest (float) – specify the best ligand efficiency value accepted
score_percentile (float) – specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.
le_percentile (float) – specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.
vdw_interactions (list[tuple]) – define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_interactions (list[tuple]) – define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
reactive_interactions (list[tuple]) – check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_count (list[tuple]) – accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]
react_any (bool) – check if ligand reacted with any residue
max_miss (int) – Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from given set. By default, only the union of poses passing the interaction filter combinations is returned. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.
ligand_name (list[str]) – specify ligand name(s). Will combine name filters with OR, e.g., [“lig1”, “lig2”]
ligand_substruct (list[str]) – SMARTS pattern(s) for substructure matching, e.g., [“ccc”, “CN”]
ligand_substruct_pos (list[str]) – SMARTS, index of atom in SMARTS, cutoff dist, and target XYZ coords, e.g., [‘”[Oh]C” 0 1.2 -5.5 10.0 15.5’] -> [“smart_string index_of_positioned_atom cutoff_distance x y z”]
ligand_max_atoms (int) – Maximum number of heavy atoms a ligand may have
ligand_operator (str) – logical join operator for multiple SMARTS (default: OR), either AND or OR
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
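Two of the filter conventions above are easy to misread, so here is a small sketch of each: the sign rule for hb_count, and the colon-separated residue specification used by the interaction filters. Both helpers are hypothetical and only mirror the wording of this reference; they are not part of Ringtail.

```python
def passes_hb_count(n_hb, requested):
    """Apply the hb_count rule: at least N bonds, or at most |N| when N is negative."""
    if requested < 0:
        return n_hb <= abs(requested)
    return n_hb >= requested


def parse_interaction(spec, wanted=True):
    """Split a 'chain:resname:resid:atomname' spec into its four fields.

    Fields left empty in the spec (e.g. the atom name in 'A:VAL:279:')
    are returned as empty strings.
    """
    chain, resname, resid, atomname = spec.split(":")
    return {
        "chain": chain,
        "resname": resname,
        "resid": resid,
        "atomname": atomname,
        "wanted": wanted,
    }


parsed = parse_interaction("A:VAL:279:", True)
```

A spec with more or fewer than four colon-separated fields raises ValueError from the tuple unpacking, which is a reasonable failure mode for a sketch like this.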
- set_output_options(log_file: str = None, export_sdf_path: str = None, enumerate_interaction_combs: bool = None, dict: dict = None)
Creates output options object that holds attributes related to reading and outputting results. Will assign log_file name and export_sdf_path to the output_manager object.
- Parameters:
log_file (str) – by default, results are saved in “output_log.txt”; if this option is used, ligands and requested info passing the filters will be written to specified file
export_sdf_path (str) – specify the path where to save poses of ligands passing the filters (SDF format); if the directory does not exist, it will be created; if it already exists, it will throw an error, unless --overwrite is used. NOTE: the log file will be automatically saved in this path. Ligands will be stored as SDF files in the order specified.
enumerate_interaction_combs (bool) – When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
- set_resultsman_attributes(store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, dict: dict = None)
Create results_manager_options object if needed, sets options, and assigns them to the results manager object.
- Parameters:
store_all_poses (bool) – store all ligand poses; overrides max_poses
max_poses (int) – how many poses to save (ordered by some score?)
add_interactions (bool) – add ligand-receptor interaction data, only in vina mode
interaction_tolerance (float) – longest ångström distance that is considered interaction?
interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction
max_proc (int) – max number of computer processors to use for file reading
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
- set_storageman_attributes(filter_bookmark: str = None, duplicate_handling: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, output_all_poses: str = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, dict: dict = None)
Create storage_manager_options object if needed, sets options, and assigns them to the storage manager object.
- Parameters:
filter_bookmark (str) – Perform filtering over specified bookmark. (in output group in CLI)
duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
overwrite (bool) – by default, if a log file exists, it doesn’t get overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database
order_results (str) – Stipulates how to order the results when written to the log file. By default, results are ordered as they were added to the database. ONLY TAKES ONE OPTION. Available fields are: “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds)
outfields (str) – defines which fields are used when reporting the results (to stdout and to the log file); fields are specified as comma-separated values, e.g. “--outfields=e,le,hb”; by default, docking_score (energy) and ligand name are reported; ligand always reported in first column. Available fields are: “Ligand_name” (Ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name). Fields are printed in the order in which they are provided. Ligand name will always be returned and will be added in first position if not specified.
output_all_poses (bool) – By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.
mfpt_cluster (float) – Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.
interaction_cluster (float) – Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.
bookmark_name (str) – name for resulting bookmark file. Default value is “passing_results”
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
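The clustering options above (mfpt_cluster and interaction_cluster) combine a Tanimoto distance with Butina clustering and then keep the ligand with the lowest ligand efficiency per cluster. The pure-Python sketch below illustrates that idea on fingerprints represented as sets of on-bits. It is a simplified Butina-style procedure written for this example; the function names and toy data are hypothetical, and Ringtail's own implementation may differ in detail.

```python
def tanimoto_distance(a, b):
    """1 - |a & b| / |a | b| for two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union


def butina_clusters(fingerprints, cutoff=0.5):
    """Greedy Butina-style clustering; returns index lists, centroid first."""
    n = len(fingerprints)
    # Neighbors of i: every other item within the distance cutoff.
    neighbors = [
        {
            j
            for j in range(n)
            if j != i and tanimoto_distance(fingerprints[i], fingerprints[j]) <= cutoff
        }
        for i in range(n)
    ]
    unassigned = set(range(n))
    clusters = []
    while unassigned:
        # Centroid: unassigned item with most unassigned neighbors (ties: lowest index).
        centroid = max(sorted(unassigned), key=lambda i: len(neighbors[i] & unassigned))
        members = sorted(neighbors[centroid] & unassigned)
        clusters.append([centroid] + members)
        unassigned -= {centroid, *members}
    return clusters


# Toy fingerprints and ligand efficiencies (made-up values).
fps = [{1, 2, 3, 4}, {1, 2, 3, 5}, {1, 2, 3, 4, 5}, {10, 11}, {10, 11, 12}]
leff = [-0.30, -0.42, -0.35, -0.20, -0.25]

clusters = butina_clusters(fps, cutoff=0.5)
# Keep the ligand with the lowest ligand efficiency from each cluster.
representatives = [min(cluster, key=lambda i: leff[i]) for cluster in clusters]
```

A smaller cutoff produces more, tighter clusters and therefore more representatives; a larger cutoff merges more ligands into each cluster.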
- update_database_version(consent=False, new_version='2.0.0')
Method to update database version from earlier versions to either 1.1.0 or 2.0.0
- write_flexres_pdb(receptor_polymer, ligname: str, filename: str, bookmark_name: str = None)
Writes a receptor pdb with flexible residues based on the ligand provided
- Parameters:
receptor_polymer (Polymer) – version of receptor produced by meeko
ligname (str) – ligand name for which the receptor flexible residue info should be collected
filename (str) – name of the output pdb, extension is optional, will default to ‘.pdb’
bookmark_name (str, optional) – will use last used bookmark if not specified, will not work in a db without any filtering performed
- write_molecule_sdfs(sdf_path: str | None = None, all_in_one: bool = True, bookmark_name: str = None, write_nonpassing: bool = None)
Have output manager write molecule sdf files for passing results in given results bookmark
- Parameters:
sdf_path (str, optional) – Optional path, existing or to be created in the current directory, where SDF files will be saved
all_in_one (bool, optional) – If True will write all molecules to one SDF (separated by $$$$), if False will write one molecule per SDF
bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering
write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands
- Raises:
StorageError – if bookmark or data not found
- exception ringtail.StorageError
Bases:
Exception
- class ringtail.StorageManager
Bases:
object
- check_passing_bookmark_exists(bookmark_name: str | None = None)
Checks if bookmark name is in database
- Parameters:
bookmark_name (str, optional) – name of bookmark to check for; if omitted, the storageman bookmark_name attribute is used
- Returns:
indicates if bookmark_name exists in the current database
- Return type:
bool
- check_storage_compatibility()
Checks if chosen storage type has been implemented
- Parameters:
storage_type (str) – name of the storage type
- Raises:
NotImplementedError – raised if selected storage type has not been implemented
- Returns:
class of the implemented storage type
- Return type:
class
- close_storage(attached_db=None, vacuum=False)
Close connection to database
- Parameters:
attached_db (str, optional) – name of attached DB (not including file extension)
vacuum (bool, optional) – indicates that database should be vacuumed before closing
- crossref_filter(new_db: str, bookmark1_name: str, bookmark2_name: str, selection_type='-', old_db=None) tuple
Selects ligands found or not found in the given bookmark in both current db and new_db. Stores as temp view
- Parameters:
new_db (str) – file name for database to attach
bookmark1_name (str) – string for name of first bookmark/temp table to compare
bookmark2_name (str) – string for name of second bookmark to compare
selection_type (str) – “+” or “-” indicating if ligand names should (“+”) or should not (“-”) be in both databases
old_db (str, optional) – file name for previous database
- Returns:
(name of new bookmark (str), number of ligands passing new bookmark (int))
- Return type:
tuple
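Cross-referencing two databases in SQLite is typically done by attaching the second file to the current connection. The sketch below shows that pattern with ATTACH DATABASE and the INTERSECT and EXCEPT set operators. It is an assumption-laden illustration, not Ringtail's query: the `passing` table, the helper names, and the reading of “-” as “in the current database but not in the other” are all choices made for this example.

```python
import os
import sqlite3
import tempfile


def make_db(path, names):
    """Create a toy database with one table of passing ligand names."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE passing (LigName TEXT)")
    conn.executemany("INSERT INTO passing VALUES (?)", [(n,) for n in names])
    conn.commit()
    return conn


def crossref(conn, other_db_path, selection_type="-"):
    """Names in the current db that are ("+") or are not ("-") also in the other db."""
    conn.execute("ATTACH DATABASE ? AS other", (other_db_path,))
    operator = "INTERSECT" if selection_type == "+" else "EXCEPT"
    rows = conn.execute(
        f"SELECT LigName FROM main.passing {operator} "
        "SELECT LigName FROM other.passing ORDER BY LigName"
    ).fetchall()
    conn.execute("DETACH DATABASE other")
    return [row[0] for row in rows]


with tempfile.TemporaryDirectory() as tmp:
    current = make_db(os.path.join(tmp, "current.db"), ["lig1", "lig2", "lig3"])
    other_path = os.path.join(tmp, "other.db")
    make_db(other_path, ["lig2", "lig3", "lig4"]).close()
    in_both = crossref(current, other_path, "+")
    only_current = crossref(current, other_path, "-")
    current.close()
```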
- field_to_column_name = {'Ligand_name': 'LigName', 'delta': 'deltas', 'e': 'docking_score', 'e_elec': 'energies_electro', 'e_inter': 'energies_inter', 'e_intra': 'energies_intra', 'e_vdw': 'energies_vdw', 'hb': 'num_hb', 'interactions': 'interactions', 'le': 'leff', 'ligand_smile': 'ligand_smile', 'n_interact': 'nr_interactions', 'rank': 'pose_rank', 'receptor': 'receptor', 'ref_rmsd': 'reference_rmsd', 'run': 'run_number'}
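The mapping above is what turns an outfields string such as “e,le,hb” into database column names. A minimal sketch of that translation follows, including the documented rule that the ligand name is added in first position when it is not requested. The function `outfields_to_columns` is hypothetical; only the dictionary contents are taken from this reference.

```python
# Copied from the field_to_column_name attribute documented above.
FIELD_TO_COLUMN = {
    "Ligand_name": "LigName",
    "delta": "deltas",
    "e": "docking_score",
    "e_elec": "energies_electro",
    "e_inter": "energies_inter",
    "e_intra": "energies_intra",
    "e_vdw": "energies_vdw",
    "hb": "num_hb",
    "interactions": "interactions",
    "le": "leff",
    "ligand_smile": "ligand_smile",
    "n_interact": "nr_interactions",
    "rank": "pose_rank",
    "receptor": "receptor",
    "ref_rmsd": "reference_rmsd",
    "run": "run_number",
}


def outfields_to_columns(outfields):
    """Translate a comma-separated outfields string into column names."""
    fields = [f.strip() for f in outfields.split(",") if f.strip()]
    if "Ligand_name" not in fields:
        fields.insert(0, "Ligand_name")  # ligand name is always reported
    unknown = [f for f in fields if f not in FIELD_TO_COLUMN]
    if unknown:
        raise ValueError(f"unknown outfields: {unknown}")
    return [FIELD_TO_COLUMN[f] for f in fields]


columns = outfields_to_columns("e,le,hb")
```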
- filter_results(all_filters: dict, suppress_output=False) iter
Generate and execute database queries from given filters.
- Parameters:
all_filters (dict) – dict containing all filters. Expects format and keys corresponding to ringtail.Filters().todict()
suppress_output (bool) – whether to suppress printing of the filtering summary to stdout
- Returns:
iterable, such as an sqlite cursor, of passing results
- Return type:
iter
- finalize_database_write()
Finalizes a database after it has been written to, and saves the current database schema to the SQLite database.
- get_plot_data(bookmark_name: str = None, only_passing=False)
This function is expected to return an ascii plot representation of the results
- Parameters:
bookmark_name (str) – name of bookmark for which to fetch passing data. Will use default bookmark name if None. Returns empty list if bookmark does not exist.
only_passing (bool) – Only return data for passing ligands. Will return empty list for all data.
- Returns:
cursors as (<all data cursor>, <passing data cursor>)
- Return type:
tuple
- insert_data(results_array, ligands_array, interaction_list, receptor_array=[], insert_receptor=False)
Inserts data from all arrays returned from results manager.
- Parameters:
results_array (list) – list of data to be stored in Results table
ligands_array (list) – list of data to be stored in Ligands table
interaction_list (list) – list of data to be stored in interaction tables
receptor_array (list) – list of data to be stored in Receptors table
insert_receptor (bool, optional) – flag indicating that receptor info should be inserted
- insert_interactions(Pose_IDs: list, interactions_list, duplicates)
Takes list of interactions, inserts into database
- Parameters:
Pose_IDs (list(int)) – list of pose ids assigned while writing the current results to database
interactions_list (list) – List of tuples for interactions in form (“type”, “chain”, “residue”, “resid”, “recname”, “recid”)
duplicates (list(Pose_ID)) – any duplicates identified in “insert_results”, if duplicate handling has been specified
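The “ignore” and “replace” values of duplicate_handling correspond naturally to SQLite's INSERT OR IGNORE and INSERT OR REPLACE conflict clauses, which only take effect against a uniqueness constraint. The sketch below demonstrates the three behaviours on a toy table. That Ringtail implements duplicate handling this way is an assumption; the table layout, index name, and helper are hypothetical.

```python
import sqlite3


def insert_results(rows, duplicate_handling=None):
    """Insert rows into a toy Results table and return the stored rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Results (LigName TEXT, pose_rank INTEGER, docking_score REAL)"
    )
    if duplicate_handling is None:
        # No uniqueness constraint: duplicate rows are simply stored.
        statement = "INSERT INTO Results VALUES (?, ?, ?)"
    else:
        if duplicate_handling not in ("ignore", "replace"):
            raise ValueError("duplicate_handling must be 'ignore' or 'replace'")
        # Conflict clauses only fire when a unique index defines what a duplicate is.
        conn.execute("CREATE UNIQUE INDEX uq_pose ON Results (LigName, pose_rank)")
        statement = f"INSERT OR {duplicate_handling.upper()} INTO Results VALUES (?, ?, ?)"
    conn.executemany(statement, rows)
    stored = conn.execute(
        "SELECT LigName, pose_rank, docking_score FROM Results ORDER BY rowid"
    ).fetchall()
    conn.close()
    return stored


# Same ligand and pose rank inserted twice with different scores.
rows = [("lig1", 0, -7.5), ("lig1", 0, -8.0)]
allowed = insert_results(rows)
ignored = insert_results(rows, "ignore")
replaced = insert_results(rows, "replace")
```

“ignore” keeps the first row written, while “replace” keeps the last one, so the choice matters when the same ligand is docked more than once.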
- prune()
Deletes rows from results, ligands, and interactions in a bookmark if they do not pass filtering criteria
- class ringtail.StorageManagerSQLite(db_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, filter_bookmark: str = None, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, duplicate_handling: str = None)
Bases:
StorageManager
SQLite-specific StorageManager subclass
- conn
Connection to database
- Type:
SQLite.conn
- open_cursors
list of cursors that were not closed by the function that created them. Will be closed by close_connection method.
- Type:
list
- db_file
database name
- Type:
str
- overwrite
switch to overwrite database if it exists
- Type:
bool
- order_results
what column name will be used to order results once read
- Type:
str
- outfields
data fields/columns to include when reading and outputting data
- Type:
str
- filter_bookmark
name of bookmark that filtering will be performed over
- Type:
str
- output_all_poses
whether or not to output all poses of a ligand
- Type:
bool
- mfpt_cluster
Tanimoto distance cutoff for clustering ligands based on Morgan fingerprints
- Type:
float
- interaction_cluster
Tanimoto distance cutoff for clustering ligands based on interaction fingerprints
- Type:
float
- bookmark_name
name of current bookmark being written to or read from
- Type:
str
- duplicate_handling
optional attribute to deal with insertion of ligands already in the database
- Type:
str
- current_bookmark_name
name of last view to have been written to in the database
- Type:
str
- filtering_window
name of bookmark/view being filtered on
- Type:
str
- index_columns
- Type:
list
- view_suffix
current suffix for views
- Type:
int
- temptable_suffix
current suffix for temporary tables
- Type:
int
- field_to_column_name
Dictionary for converting ringtail options into DB column names
- Type:
dict
- bookmark_has_rows(bookmark_name: str) bool
Method that checks if a given bookmark has any data in it
- Parameters:
bookmark_name (str) – view to check
- Returns:
True if more than zero rows in bookmark
- Return type:
bool
- check_ringtaildb_version()
Checks the database version and confirms whether the code base is compatible with it
- Returns:
whether or not db is compatible with the code base
str: current database version
- Return type:
bool
- check_storage_ready(run_mode: str, docking_mode: str, store_all_poses: bool, max_poses: int)
Check that storage is ready before proceeding, and creates new tables if needed
- Parameters:
run_mode (str) – whether ringtail is run using the command line interface or the API
docking_mode (str) – what docking engine was used to produce results
store_all_poses (bool) – overrides max_poses
max_poses (int) – max poses to save to db
- Raises:
OptionError – if database options are not compatible
- clone(backup_name=None)
Creates a copy of the db
- Parameters:
backup_name (str, optional) – name of the cloned database
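One standard way to copy an SQLite database from Python is the `Connection.backup` method (available since Python 3.7). The sketch below uses it to illustrate what a clone operation amounts to. Whether Ringtail's clone uses this mechanism is not stated here, and the helper `clone_database` is hypothetical.

```python
import sqlite3


def clone_database(source, backup_path=":memory:"):
    """Copy every table of `source` into a new database and return its connection."""
    dest = sqlite3.connect(backup_path)
    source.backup(dest)  # page-by-page copy of the whole database
    return dest


source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE Ligands (LigName TEXT)")
source.execute("INSERT INTO Ligands VALUES ('lig1')")
source.commit()

copy = clone_database(source)

# Later writes to the source do not affect the clone.
source.execute("INSERT INTO Ligands VALUES ('lig2')")
source.commit()
copy_names = [row[0] for row in copy.execute("SELECT LigName FROM Ligands")]
source_count = source.execute("SELECT COUNT(*) FROM Ligands").fetchone()[0]
```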
- count_receptors_in_db()
returns number of rows in Receptors table where receptor_object already has blob
- Returns:
number of rows in receptors table
str: name of receptor if present in table
- Return type:
int
- Raises:
- create_bookmark(name, query, temp=False, add_poseID=False, filters={})
Takes name and selection query and creates a bookmark of name. Bookmarks are Ringtail-specific views whose information is stored in the ‘Bookmark’ table. FIXME: results from ligand-only filters are not added as bookmarks.
- Parameters:
name (str) – Name for bookmark which will be created
query (str) – SQLite-formatted query used to create bookmark
temp (bool, optional) – Flag if bookmark should be temporary
add_poseID (bool, optional) – Add Pose_ID column to bookmark
filters (dict, optional) – a dict of filters used to construct the query
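Since bookmarks are described as views with bookkeeping in a ‘Bookmark’ table, the idea can be sketched with plain sqlite3. The column layout of the `Bookmark` table, the helper names, and the toy data below are assumptions made for this example and do not reflect Ringtail's actual schema.

```python
import json
import sqlite3


def create_view_bookmark(conn, name, query, filters=None, temp=False):
    """Create a (temporary) view for `query` and record it in the Bookmark table."""
    safe = name.replace('"', '""')
    kind = "TEMP VIEW" if temp else "VIEW"
    conn.execute(f'DROP VIEW IF EXISTS "{safe}"')
    conn.execute(f'CREATE {kind} "{safe}" AS {query}')
    conn.execute(
        "INSERT OR REPLACE INTO Bookmark VALUES (?, ?, ?)",
        (name, query, json.dumps(filters or {})),
    )


def view_has_rows(conn, name):
    """True if the view returns at least one row."""
    safe = name.replace('"', '""')
    return conn.execute(f'SELECT EXISTS (SELECT 1 FROM "{safe}")').fetchone()[0] == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Results (Pose_ID INTEGER PRIMARY KEY, LigName TEXT, docking_score REAL)"
)
# Hypothetical bookkeeping table: name, defining query, and filters as JSON.
conn.execute("CREATE TABLE Bookmark (name TEXT PRIMARY KEY, query TEXT, filters TEXT)")
conn.executemany(
    "INSERT INTO Results (LigName, docking_score) VALUES (?, ?)",
    [("lig1", -7.5), ("lig2", -6.1)],
)

create_view_bookmark(
    conn,
    "passing_results",
    "SELECT * FROM Results WHERE docking_score < -7.0",
    filters={"eworst": -7.0},
)
create_view_bookmark(conn, "empty_bookmark", "SELECT * FROM Results WHERE docking_score < -20.0")

passing = conn.execute('SELECT LigName FROM "passing_results"').fetchall()
stored_filters = json.loads(
    conn.execute("SELECT filters FROM Bookmark WHERE name = 'passing_results'").fetchone()[0]
)
```

Because a view stores only the query, it reflects later changes to the underlying table; storing the filters alongside it is what allows them to be retrieved afterwards.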
- create_bookmark_from_temp_table(temp_table_name, bookmark_name, original_bookmark_name, wanted_list, unwanted_list=[])
Resaves temp bookmark stored in self.current_bookmark_name as new permanent bookmark
- Parameters:
bookmark_name (str) – name of bookmark to save last temp bookmark as
original_bookmark_name (str) – name of original bookmark
wanted_list (list) – List of wanted database names
unwanted_list (list, optional) – List of unwanted database names
temp_table_name (str) – name of temporary table
- create_temp_table_from_bookmark()
Method that creates a temporary table named “passing_temp”. Please note that this table will be dropped as soon as the database connection closes.
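The note that the temporary table disappears when the connection closes is standard SQLite behaviour and can be checked directly. The sketch below builds a temporary table from a view, closes the connection, and reopens the file. The view name and data are hypothetical; only the table name “passing_temp” is taken from the description above.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")

    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL)")
    conn.execute("INSERT INTO Results VALUES ('lig1', -7.5)")
    conn.execute(
        "CREATE VIEW passing_results AS SELECT * FROM Results WHERE docking_score < -7"
    )
    # Materialize the view into a temporary table tied to this connection.
    conn.execute("CREATE TEMP TABLE passing_temp AS SELECT * FROM passing_results")
    conn.commit()
    rows_in_temp = conn.execute("SELECT COUNT(*) FROM passing_temp").fetchone()[0]
    conn.close()

    # A new connection sees the view but not the temporary table.
    conn = sqlite3.connect(path)
    temp_exists_after_reopen = (
        conn.execute(
            "SELECT COUNT(*) FROM sqlite_temp_master WHERE name = 'passing_temp'"
        ).fetchone()[0]
        > 0
    )
    view_exists_after_reopen = (
        conn.execute(
            "SELECT COUNT(*) FROM sqlite_master WHERE name = 'passing_results'"
        ).fetchone()[0]
        > 0
    )
    conn.close()
```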
- drop_bookmark(bookmark_name: str)
Drops specified bookmark from database
- Parameters:
bookmark_name (str) – bookmark to be dropped
- Raises:
- fetch_bookmark(bookmark_name: str) Cursor
returns SQLite cursor of all fields in bookmark
- Parameters:
bookmark_name (str) – name of bookmark to retrieve
- Returns:
cursor of requested view
- Return type:
sqlite3.Cursor
- fetch_clustered_similars(ligname: str)
Given ligname, returns poseids for similar poses/ligands from previous clustering. User prompted at runtime to choose cluster.
- Parameters:
ligname (str) – ligname for ligand to find similarity with
- Raises:
ValueError – wrong terminal input
- fetch_data_for_passing_results() iter
Will return SQLite cursor with requested data for outfields for poses that passed filter in self.bookmark_name
- Returns:
sqlite cursor of data from passing data
- Return type:
iter
- Raises:
- fetch_filters_from_bookmark(bookmark_name: str | None = None)
Method that will retrieve filter values used to construct bookmark
- Parameters:
bookmark_name (str, optional) – can get filter values for given bookmark, or filter values from currently active bookmark in storageman
- Returns:
containing the filter data
- Return type:
dict
- fetch_flexres_info()
fetch flexible residues names and atomname lists
- Returns:
(flexible_residues, flexres_atomnames)
- Return type:
tuple
- fetch_interaction_info_by_index(interaction_idx) tuple
Returns tuple containing interaction info for given interaction_idx
- Parameters:
interaction_idx (int) – interaction index to fetch info for
- Returns:
tuple of info for requested interaction
- Return type:
tuple
- fetch_nonpassing_pose_properties(ligname)
fetch coordinates for poses of ligname which did not pass the filter
- Parameters:
ligname (str) – name of ligand to fetch coordinates for
- Returns:
SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues
- Return type:
iter
- fetch_passing_ligand_output_info() iter
fetch information required by vsmanager for writing out molecules
- Returns:
contains LigName, ligand_smile, atom_index_map, hydrogen_parents
- Return type:
iter
- fetch_passing_pose_properties(ligname)
fetch coordinates for poses passing filter for given ligand
- Parameters:
ligname (str) – name of ligand to fetch coordinates for
- Returns:
SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues
- Return type:
iter
- fetch_pose_interactions(Pose_ID) iter
Fetch all interactions parameters belonging to a Pose_ID
- Parameters:
Pose_ID (int) – pose id, 1-1 with Results table
- Returns:
iterable of interaction information for the given Pose_ID
- Return type:
iter
- fetch_receptor_object_by_name(rec_name)
Returns Receptor object from database for given rec_name
- Parameters:
rec_name (str) – Name of receptor to return object for
- Returns:
receptor object as a string
- Return type:
str
- fetch_receptor_objects()
Returns all Receptor objects from the database
- Returns:
of receptor names and objects
- Return type:
iter (tuple)
- fetch_single_ligand_output_info(ligname) str
get output information for given ligand
- Parameters:
ligname (str) – ligand name
- Raises:
- Returns:
information containing smiles, atom and index mapping, and hydrogen parents
- Return type:
str
- fetch_single_pose_properties(pose_ID: int) iter
fetch coordinates for pose given by pose_ID
- Parameters:
pose_ID (int) – ID of the pose to fetch coordinates for
- Returns:
SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues
- Return type:
iter
- fetch_summary_data(columns=['docking_score', 'leff'], percentiles=[1, 10]) dict
Collect summary data for the database: number of ligands, number of stored poses, number of unique interactions, and the min, max, and percentiles for each column in columns
- Parameters:
columns (list (str)) – columns to be displayed and used in summary
percentiles (list(int)) – percentiles to consider
- Returns:
of data summary
- Return type:
dict
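The per-column part of such a summary (min, max, and selected percentiles) can be sketched in plain Python. This is an illustration under stated assumptions, not Ringtail's code: the function name, the dictionary keys, and the nearest-rank percentile definition are all hypothetical choices, and the real method may compute percentiles differently.

```python
def summarize_column(values, percentiles=(1, 10)):
    """Return min, max, and nearest-rank percentile values for one column."""
    ordered = sorted(values)
    n = len(ordered)
    summary = {"min": ordered[0], "max": ordered[-1]}
    for p in percentiles:
        # Nearest-rank percentile: rank = ceil(p * n / 100), at least 1.
        rank = max(1, (p * n + 99) // 100)
        summary[f"{p}%"] = ordered[rank - 1]
    return summary


# Hypothetical docking scores: -1.0, -2.0, ..., -100.0 (lower is better).
scores = [-float(i) for i in range(1, 101)]
summary = summarize_column(scores)
```

With scores sorted in ascending order, low percentiles correspond to the most negative values in the list.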
- classmethod format_for_storage(ligand_dict: dict) tuple
Takes the file dictionary from the file parser and converts it to the required storage format
- Parameters:
ligand_dict (dict) – Dictionary containing data from the fileparser
- Returns:
of lists ([result_row_1, result_row_2, …], ligand_row, [interaction_tuple_1, interaction_tuple_2, …])
- Return type:
tuple
- get_all_bookmark_names()
Get the names of all bookmarks in the SQL database as a list. Bookmarks are a specific type of SQLite view whose information is stored in the Bookmarks table.
- Returns:
of bookmark names
- Return type:
list
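The relationship between bookmark views and a bookkeeping table can be sketched with the standard sqlite3 module. Everything below is assumed for illustration: the column names of the Bookmarks table, the JSON encoding of filters, and both helper functions are hypothetical and may not match Ringtail's actual schema.

```python
import json
import sqlite3


def create_bookmark(conn, name, max_score):
    """Create a view over Results and record it in the Bookmarks table."""
    if not name.isidentifier():
        raise ValueError(f"invalid bookmark name: {name!r}")
    # CREATE VIEW cannot use bound parameters, so the validated name and a
    # float-converted threshold are interpolated directly.
    conn.execute(
        f"CREATE VIEW {name} AS "
        f"SELECT * FROM Results WHERE docking_score <= {float(max_score)}"
    )
    conn.execute(
        "INSERT INTO Bookmarks (Bookmark_name, filters) VALUES (?, ?)",
        (name, json.dumps({"max_score": float(max_score)})),
    )


def get_bookmark_names(conn):
    """Return bookmark names recorded in the Bookmarks table."""
    cur = conn.execute(
        "SELECT Bookmark_name FROM Bookmarks ORDER BY Bookmark_name"
    )
    return [row[0] for row in cur]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Results (Pose_ID INTEGER PRIMARY KEY, docking_score REAL)"
)
conn.execute(
    "CREATE TABLE Bookmarks (Bookmark_name TEXT PRIMARY KEY, filters TEXT)"
)
create_bookmark(conn, "strict", -9.0)
create_bookmark(conn, "loose", -6.0)
names = get_bookmark_names(conn)
```

Keeping the names in a table, rather than listing every view in sqlite_master, lets bookmarks be distinguished from any other views the database may hold.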
- get_current_bookmark_name()
returns current bookmark name
- Returns:
name of last passing results bookmark used by database
- Return type:
str
- get_maxmiss_union(total_combinations: int)
Get results that are in union considering max miss
- Parameters:
total_combinations (int) – number of possible combinations
- Returns:
of passing results
- Return type:
iter
- insert_receptor_blob(receptor, rec_name)
Takes object of Receptor class, updates the column in Receptor table
- Parameters:
receptor (bytes) – bytes receptor object to be inserted into DB
rec_name (string) – Name of receptor. Used to insert into correct row of DB
- Raises:
DatabaseInsertionError – Description
- overwrite_storage()
Will drop all tables in the database.
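Dropping every table in an SQLite file can be sketched as follows, using only the standard sqlite3 module. The function name is hypothetical and the approach (enumerating sqlite_master and skipping SQLite's internal tables) is an assumption about how such a reset could be done, not a description of Ringtail's implementation.

```python
import sqlite3


def drop_all_tables(conn):
    """Drop every user table and return the list of dropped names."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    for name in names:
        # Quote the identifier and escape embedded double quotes.
        quoted = '"' + name.replace('"', '""') + '"'
        conn.execute(f"DROP TABLE IF EXISTS {quoted}")
    conn.commit()
    return names
```

The operation is destructive and cannot be undone, so a caller would normally confirm intent before invoking it.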
- set_bookmark_suffix(suffix)
Sets internal bookmark_suffix variable
- Parameters:
suffix (str) – suffix to attach to bookmark-related queries or creation
- to_dataframe(requested_data: str, table=True) pandas.DataFrame
Returns a pandas DataFrame of the table or query given as requested_data
- Parameters:
requested_data (str) – String containing SQL-formatted query or table name
table (bool) – Flag indicating if requested_data is table name or not
- Returns:
dataframe of requested data
- Return type:
pd.DataFrame
- update_database_version(new_version, consent=False)
Updates the SQLite database schema from version 1.0.0 or 1.1.0 to version 1.1.0 or 2.0.0.
Note: If you created a version 1 database with the duplicate handling option, anything involving interactions may behave inconsistently, because Pose_ID was not used as an explicit foreign key in db v1.0.0 and v1.1.0.
- Parameters:
consent (bool, optional) – variable to ensure consent to update database is explicit
- Returns:
bool
- exception ringtail.WriteToStorageError
Bases:
Exception
- class ringtail.Writer(*args: Any, **kwargs: Any)
Bases:
Process
This class is a listener that retrieves data from the queue and writes it into the database
- process_data(data_packet)
Breaks up the data in the data_packet to distribute between the different arrays to be inserted in the database.
- Parameters:
data_packet (any) – File packet to be processed
- run()
Method override from the parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method, which is invoked when the process is launched with start() (see below)
- Raises:
- write_to_storage()
Inserting data to the database through the designated storagemanager.
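The listener pattern described for this class can be sketched with the standard library. This is not Ringtail's Writer: it uses multiprocessing.Process from the standard library rather than the multiprocess package, and the class name QueueWriter, the None sentinel, the packet keys, the batch size, and the two-column Results table are all hypothetical.

```python
import multiprocessing
import sqlite3

STOP = None  # hypothetical sentinel that tells the writer to finish


class QueueWriter(multiprocessing.Process):
    """Pulls data packets from a queue and writes them to SQLite in batches."""

    def __init__(self, queue, db_path, batch_size=2):
        super().__init__()
        self.queue = queue
        self.db_path = db_path
        self.batch_size = batch_size
        self.rows = []

    def process_data(self, data_packet):
        # Split the packet into the row that will be inserted.
        self.rows.append(
            (data_packet["LigName"], data_packet["docking_score"])
        )

    def write_to_storage(self, conn):
        # Insert the accumulated rows and reset the buffer.
        conn.executemany("INSERT INTO Results VALUES (?, ?)", self.rows)
        conn.commit()
        self.rows = []

    def run(self):
        # Executed in the child process once start() is called.
        conn = sqlite3.connect(self.db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS Results "
            "(LigName TEXT, docking_score REAL)"
        )
        while True:
            packet = self.queue.get()  # blocks until a packet is available
            if packet is STOP:
                break
            self.process_data(packet)
            if len(self.rows) >= self.batch_size:
                self.write_to_storage(conn)
        if self.rows:  # flush any remainder
            self.write_to_storage(conn)
        conn.close()
```

In use, producers would put packets on a multiprocessing.Queue, the writer would be launched with start(), and the main process would put the sentinel on the queue and call join() once all packets have been queued.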
- ringtail.parse_single_dlg(fname)
Parse an ADGPU DLG file, either uncompressed or gzipped
- Parameters:
fname (str) – ligand docking result file name
- Raises:
ValueError –
- Returns:
parsed results ready to be inserted in database
- Return type:
dict
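Handling both uncompressed and gzipped input usually comes down to choosing the right opener. The sketch below shows one common way to do this with the standard gzip module; the helper names are hypothetical, and detecting compression from the “.gz” extension is an assumption, since the text above does not say how the parser decides.

```python
import gzip


def open_maybe_gzipped(fname):
    """Open a text file, transparently decompressing it if it ends in .gz."""
    if fname.endswith(".gz"):
        return gzip.open(fname, "rt")  # "rt" yields str lines, not bytes
    return open(fname, "r")


def read_lines(fname):
    """Return all lines of a plain or gzipped text file."""
    with open_maybe_gzipped(fname) as handle:
        return handle.readlines()
```

A parser built on such a helper can then iterate over lines identically for both file types.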
- ringtail.parse_vina_result(data_pointer) dict
Parser for vina docking results, supporting either pdbqt or gzipped (.gz) files, or with the docking results provided as a string.
- Parameters:
data_pointer (any) – either filename or dictionary of string docking results
- Returns:
parsed results ready to be inserted in database
- Return type:
dict