Ringtail package

Submodules

ringtail.cloptionparser module

class ringtail.cloptionparser.CLOptionParser

Bases: object

Command line option/argument parser. Options and switches are utilized in the script ‘rt_process_vs.py’.

process_mode

operating in ‘write’ or ‘read’ mode

Type:: str

rtcore

ringtail core object initialized with the provided db_file

Type:: RingtailCore

filters

fully parsed and organized optional filters

Type:: dict

file_sources

fully parsed docking results and receptor files

Type:: dict

writeopts

fully parsed arguments related to database writing

Type:: dict

storageopts

fully parsed arguments related to how the storage system behaves

Type:: dict

outputopts

fully parsed arguments related to output and reading from the database

Type:: dict

print_summary

switch to print database summary

Type:: bool

filtering

switch to run filtering method

Type:: bool

plot

switch to plot the data

Type:: bool

export_bookmark_db

switch to export bookmark as a new database

Type:: bool

export_receptor

switch to export receptor information to pdbqt

Type:: bool

pymol

switch to visualize ligands in pymol

Type:: bool

data_from_bookmark

switch to write bookmark data to the output log file

Type:: bool

Raises:: OptionError – Error when an option cannot be parsed correctly

process_options(parsed_opts)

Process and organize command line options into ringtail options and filter dictionaries and ringtail core attributes

Parameters:: parsed_opts (argparse.Namespace) – arguments provided through the cmdline_parser method.

ringtail.cloptionparser.cmdline_parser(defaults: dict = {})

Parses options provided using the command line. All arguments are first populated with default values. If a config file is provided, these will overwrite default values. Any single arguments provided using the argument parser will overwrite default and config file values.

Parameters:: defaults (dict) – default argument values
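
The precedence described for cmdline_parser (defaults first, then config file values, then individual command line arguments) can be sketched with plain dictionaries. This is a minimal illustration and not Ringtail's implementation: `merge_options` is a hypothetical helper, the sample values are made up, and treating `None` as "not given on the command line" is an assumption.

```python
def merge_options(defaults: dict, config: dict, cli: dict) -> dict:
    """Layer option sources so that later sources win (hypothetical helper)."""
    merged = dict(defaults)   # start from the default values
    merged.update(config)     # config file values overwrite defaults
    # Assumption: an argument left as None was not given on the command line.
    merged.update({key: value for key, value in cli.items() if value is not None})
    return merged


# Illustrative values only.
defaults = {"max_poses": 3, "docking_mode": "dlg", "log_file": "output_log.txt"}
config = {"max_poses": 5}                           # e.g. loaded from a JSON config file
cli = {"docking_mode": "vina", "max_poses": None}   # max_poses not passed explicitly

options = merge_options(defaults, config, cli)
print(options)
```

Building the result in a fixed order keeps the precedence rule in one place, so adding another source only requires one more `update` call.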

ringtail.exceptions module

exception ringtail.exceptions.DatabaseConnectionError: Bases: StorageError

exception ringtail.exceptions.DatabaseInsertionError: Bases: StorageError

exception ringtail.exceptions.DatabaseQueryError: Bases: StorageError

exception ringtail.exceptions.DatabaseTableCreationError: Bases: StorageError

exception ringtail.exceptions.DatabaseViewCreationError: Bases: StorageError

exception ringtail.exceptions.FileParsingError: Bases: Exception

exception ringtail.exceptions.MultiprocessingError: Bases: Exception

exception ringtail.exceptions.NoInputError: Bases: OptionError

exception ringtail.exceptions.OptionError: Bases: Exception

exception ringtail.exceptions.OutputError: Bases: Exception

exception ringtail.exceptions.RTCoreError: Bases: Exception

exception ringtail.exceptions.ResultsProcessingError: Bases: Exception

exception ringtail.exceptions.StorageError: Bases: Exception

exception ringtail.exceptions.WriteToStorageError: Bases: Exception
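
The base classes listed above determine which `except` clause catches which error. The sketch below uses locally defined stand-in classes that mirror a few of the listed bases so that it runs without Ringtail installed; `classify` is a hypothetical helper written only for this illustration.

```python
# Stand-ins mirroring the bases listed above. With Ringtail installed, the
# real classes would be imported from ringtail.exceptions instead.
class StorageError(Exception): ...
class DatabaseQueryError(StorageError): ...
class OptionError(Exception): ...
class NoInputError(OptionError): ...
class WriteToStorageError(Exception): ...


def classify(exc: Exception) -> str:
    """Report which handler catches the given exception (illustrative helper)."""
    try:
        raise exc
    except StorageError:
        return "storage"
    except OptionError:
        return "option"
    except Exception:
        return "other"


print(classify(DatabaseQueryError("bad query")))    # caught by the StorageError handler
print(classify(NoInputError("no files given")))     # caught by the OptionError handler
print(classify(WriteToStorageError("write failed")))  # derives from Exception directly
```

Note that, per the listing, WriteToStorageError derives from Exception rather than StorageError, so a handler for StorageError alone would not catch it.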

ringtail.interactions module

class ringtail.interactions.InteractionFinder(rec_string, interaction_cutoff_radii)

Bases: object

Class for handling and calculating ligand-receptor interactions.

rec_string

string describing the receptor

Type:: str

interaction_cutoff_radii

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:: list(float)

find_pose_interactions(lig_atomtype_list: list, lig_coordinates: list) → dict

Identifies interactions for a pose within the cutoff distances given to the class.

Parameters:

lig_atomtype_list (list) – list of atoms in the ligand
lig_coordinates (list) – coordinates for the atoms in the ligand

Returns:

all interaction details for a given ligand pose

Return type:

dict
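
The general idea of keeping only ligand-receptor atom pairs within a cutoff distance can be sketched as follows. This is not the algorithm used by InteractionFinder: `contacts_within_cutoff`, the atom tuples, and the 4.0 Å cutoff are all illustrative assumptions, and real interaction typing (hydrogen bonds versus VDW) needs atom-type information that is omitted here.

```python
import math


def contacts_within_cutoff(lig_coordinates, rec_atoms, cutoff):
    """Return (ligand_index, receptor_label, distance) for pairs within cutoff (Å)."""
    contacts = []
    for index, lig_xyz in enumerate(lig_coordinates):
        for rec_label, rec_xyz in rec_atoms:
            distance = math.dist(lig_xyz, rec_xyz)  # Euclidean distance
            if distance <= cutoff:
                contacts.append((index, rec_label, round(distance, 2)))
    return contacts


# Made-up coordinates for two ligand atoms and two receptor atoms.
lig_coordinates = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
rec_atoms = [
    ("A:VAL:279:CA", (3.0, 0.0, 0.0)),
    ("A:LYS:162:NZ", (0.0, 8.0, 0.0)),
]

found = contacts_within_cutoff(lig_coordinates, rec_atoms, cutoff=4.0)
print(found)
```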

ringtail.mpmanager module

class ringtail.mpmanager.MPManager(docking_mode, max_poses, interaction_tolerance, store_all_poses, add_interactions, interaction_cutoffs, max_proc, storageman, storageman_class, chunk_size, target, receptor_file, file_pattern=None, file_sources=None, string_sources=None)

Bases: object

Manager that orchestrates parallel processing of docking results data, using one of the supported multiprocessors.

docking_mode

describes what docking engine was used to produce the results

Type:: str

max_poses

max number of poses to store for each ligand

Type:: int

interaction_tolerance

Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.

Type:: float

store_all_poses

Store all poses from docking results

Type:: bool

add_interactions

find and save interactions between ligand poses and receptor

Type:: bool

interaction_cutoffs

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:: list(float)

max_proc

Maximum number of processes to create during parallel file parsing.

Type:: int

storageman

storageman object

Type:: StorageManager

storageman_class

storagemanager child class/database type

Type:: StorageManager

chunk_size

how many tasks to send to a processor at a time

Type:: int

target

name of receptor

Type:: str

receptor_file

file path to receptor

Type:: str

file_pattern

file pattern to look for if recursively finding results files to process

Type:: str, optional

file_sources

RingtailOption object that holds all attributes related to results files

Type:: InputFiles, optional

string_sources

RingtailOption object that holds all attributes related to results strings

Type:: InputStrings, optional

num_files

number of files processed at any given time

Type:: int

process_results(): Processes results data (files or string sources) by adding them to the queue and starting their processing in multiprocess.

ringtail.mpreaderwriter module

class ringtail.mpreaderwriter.DockingFileReader(*args: Any, **kwargs: Any)

Bases: Process

This class is the individual worker for processing docking results. One instance of this class is instantiated for each available processor.

queueIn

current queue for the processor/file reader

Type:: multiprocess.Queue

queueOut

queue for the processor/file reader after adding or removing an item

Type:: multiprocess.Queue

pipe_conn

pipe connection to the reader

Type:: multiprocess.Pipe

storageman

storageman object

Type:: StorageManager

storageman_class

storagemanager child class/database type

Type:: StorageManager

docking_mode

describes what docking engine was used to produce the results

Type:: str

max_poses

max number of poses to store for each ligand

Type:: int

interaction_tolerance

Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.

Type:: float

store_all_poses

Store all poses from docking results

Type:: bool

add_interactions

find and save interactions between ligand poses and receptor

Type:: bool

interaction_cutoffs

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:: list(float)

target

receptor name

Type:: str

run()

Method overload from parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method which is called by the initialization (see below) with start()

Raises:

NotImplementedError – if parser for specific docking result type is not implemented
FileParsingError –

class ringtail.mpreaderwriter.Writer(*args: Any, **kwargs: Any)

Bases: Process

This class is a listener that retrieves data from the queue and writes it into the database

process_data(data_packet)

Breaks up the data in the data_packet to distribute between the different arrays to be inserted in the database.

Parameters:: data_packet (any) – File packet to be processed

run()

Method overload from parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method which is called by the initialization (see below) with start()

Raises:: WriteToStorageError –

write_to_storage(): Inserting data to the database through the designated storagemanager.

ringtail.outputmanager module

class ringtail.outputmanager.OutputManager(log_file=None, export_sdf_path=None)

Bases: object

Class for creating outputs, can be a context manager to handle log files

log_file

name for log file

Type:: str

export_sdf_path

path for exporting SDF molecule files

Type:: str

_log_open

if log file is open or not

Type:: bool
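
Using a class as a context manager for a log file generally follows the pattern below. `LogWriter` is a hypothetical stand-in written for illustration; it is not OutputManager and makes no claim about how OutputManager is implemented.

```python
import os
import tempfile


class LogWriter:
    """Hypothetical stand-in showing the open-on-enter, close-on-exit pattern."""

    def __init__(self, log_file):
        self.log_file = log_file
        self._handle = None

    def __enter__(self):
        self._handle = open(self.log_file, "w")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._handle.close()   # runs even if the body of the with-block raised
        self._handle = None
        return False           # do not swallow exceptions

    def write(self, line):
        self._handle.write(line + "\n")


log_path = os.path.join(tempfile.mkdtemp(), "output_log.txt")
with LogWriter(log_path) as log:
    log.write("Number passing ligands: 2")

with open(log_path) as handle:
    log_text = handle.read()
print(log_text)
```

The benefit of the pattern is that the file is closed on every exit path, so callers do not need a separate cleanup call.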

close_logfile(): Closes the log file properly and reset file pointer to filename

log_num_passing_ligands(number_passing_ligands: int)

Write the number of ligands which pass given filter to log file

Parameters:: number_passing_ligands (int) – number of ligands that passed filter
Raises:: OutputError –

open_logfile(write_filters_header=True)

Opens log file and creates it if needed

Parameters:: write_filters_header (bool) – only used because one method does not take the same headers
Raises:: OutputError –

plot_all_data(xdata, ydata, num_of_bins: int = 100)

Takes dictionary of binned data where key is the coordinates of the bin and value is the number of points in that bin. Adds to scatter plot colored by value

Parameters:

xdata (list) – list of x axis data (needs to be same length as ydata)
ydata (list) – list of y axis data (needs to be same length as xdata)
num_of_bins (int) – number of bins to organize data in

Returns:

matplotlib.pyplot.figure

Raises:

OutputError –
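
Binning two data series into a dictionary keyed by bin, with the number of points as the value, can be sketched with the standard library. This is an assumption-laden illustration and not the code behind plot_all_data: `bin_points` is a hypothetical helper, and it keys the result by integer bin indices rather than bin coordinates.

```python
from collections import Counter


def bin_points(xdata, ydata, num_of_bins=100):
    """Count points per 2D bin; keys are (x_bin_index, y_bin_index)."""
    if len(xdata) != len(ydata):
        raise ValueError("xdata and ydata must have the same length")
    x_min, x_max = min(xdata), max(xdata)
    y_min, y_max = min(ydata), max(ydata)

    def index(value, low, high):
        if high == low:   # all values identical: everything falls in one bin
            return 0
        # Scale into [0, num_of_bins) and clamp the maximum into the last bin.
        return min(int((value - low) / (high - low) * num_of_bins), num_of_bins - 1)

    return Counter(
        (index(x, x_min, x_max), index(y, y_min, y_max))
        for x, y in zip(xdata, ydata)
    )


# Made-up docking scores and ligand efficiencies.
energies = [-9.0, -8.9, -5.0, -1.0]
efficiencies = [-0.40, -0.39, -0.20, -0.10]
counts = bin_points(energies, efficiencies, num_of_bins=4)
print(dict(counts))
```

The per-bin counts are what a density-colored scatter plot would use as the color value.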

plot_single_points(x: list, y: list, markersize: int = 20, color='crimson')

Add points to scatter plot with given x and y coordinates and color.

Parameters:

x (list) – x coordinates
y (list) – y coordinates
markersize (int, optional) – Size of the point markers. Default 20.
color (str, optional) – Color for points. Default ‘crimson’.

Raises:

OutputError –

save_scatterplot()

Saves and closes current figure as scatter.png

Raises:: OutputError –

scatter_hist(x, y, z, ax_histx, ax_histy)

Makes scatterplot with a histogram on each axis

Parameters:

x (list) – x coordinates for data
y (list) – y coordinates for data
z (list) – z coordinates for data
ax (matplotlib.axis) – scatterplot axis
ax_histx (matplotlib.axis) – x histogram axis
ax_histy (matplotlib.axis) – y histogram axis

Raises:

OutputError –

write_filter_log(lines)

Writes lines from results iterable into log file

Parameters:: lines (iterable) – Iterable with tuples of data for writing into log
Raises:: OutputError –
Returns:: number of ligands passing that are written to log file
Return type:: int

write_filters_to_log(filters_dict, included_interactions, additional_info='')

Takes dictionary of filters, formats as string and writes to log file

Parameters:

filters_dict (dict) – dictionary with filtering options
included_interactions (list) – types of interactions to include in the filtering
additional_info (str) – any additional information to write to top of log file

Raises:

OutputError –

write_find_similar_header(query_ligname, cluster_name): Properly formats header for the log file find_similar_ligands

write_maxmiss_union_header(): Properly formats header for the log file if using max_miss and enumerate_interaction_combs

write_out_mol(filename, mol, flexres_mols, properties)

Writes out given mol as sdf. Will create the specified sdf folder in current working directory if needed.

Parameters:

filename (str) – name of SDF file that will be written to
mol (RDKit.Chem.Mol) – RDKit molobject to be written to SDF
flexres_mols (list) – dictionary of rdkit molecules for flexible residues
properties (dict) – dictionary of list of properties to add to mol before writing

Raises:

OutputError –

write_receptor_pdbqt(recname: str, receptor_compbytes)

Writes a pdbqt file from receptor “blob”

Parameters:

recname (str) – name of receptor to use in output filename
receptor_compbytes (blob) – receptor blob

write_results_bookmark_to_log(bookmark_name)

Write the name of the result bookmark into log

Parameters:: bookmark_name (str) – name of current results’ bookmark in db
Raises:: OutputError –

ringtail.parsers module

ringtail.parsers.parse_single_dlg(fname)

Parse an ADGPU DLG file, either uncompressed or gzipped

Parameters:

fname (str) – ligand docking result file name

Raises:

ValueError –
FileParsingError –

Returns:

parsed results ready to be inserted in database

Return type:

dict

ringtail.parsers.parse_vina_result(data_pointer) → dict

Parser for vina docking results, supporting either pdbqt or gzipped (.gz) files, or with the docking results provided as a string.

Parameters:: data_pointer (any) – either filename or dictionary of string docking results
Returns:: parsed results ready to be inserted in database
Return type:: dict

ringtail.parsers.receptor_pdbqt_parser(fname)

Parse receptor PDBQT file into a list of dictionaries, each containing the data for a single atom line

Parameters:: fname (string) – name of receptor pdbqt file to parse

ringtail.receptormanager module

class ringtail.receptormanager.ReceptorManager

Bases: object

Class with methods dealing with formatting of receptor information

static blob2str(receptor_blob)

Converts a compressed receptor blob back into the receptor file string

Parameters:: receptor_blob (blob) – zipped receptor blob
Returns:: receptor string
Return type:: str

static make_receptor_blobs(file_list)

Creates compressed receptor info

Parameters:: file_list (str) – path to receptor file
Returns:: compressed receptor
Return type:: blob
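
The round trip between receptor text and a compressed blob can be sketched as below. The use of gzip is an assumption made for this example (the text above only says the blob is zipped/compressed), and `text_to_blob` and `blob_to_text` are hypothetical helpers, not the ReceptorManager methods.

```python
import gzip


def text_to_blob(text: str) -> bytes:
    """Compress text into bytes suitable for a database BLOB column."""
    return gzip.compress(text.encode("utf-8"))


def blob_to_text(blob: bytes) -> str:
    """Invert text_to_blob."""
    return gzip.decompress(blob).decode("utf-8")


# Placeholder content standing in for the text of a receptor file.
receptor_text = "REMARK placeholder receptor contents\n" * 50
blob = text_to_blob(receptor_text)

print(len(receptor_text), "characters ->", len(blob), "bytes")
assert blob_to_text(blob) == receptor_text  # the round trip is lossless
```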

ringtail.resultsmanager module

class ringtail.resultsmanager.ResultsManager(docking_mode: str = None, max_poses: int = None, interaction_tolerance: float = None, store_all_poses: bool = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, storageman: StorageManager = None, storageman_class: StorageManager = None, chunk_size: int = 1, parser_manager: str = 'multiprocess', file_sources=None, string_sources=None)

Bases: object

Class that handles the processing of the results, including passing on the docking results to the appropriate parallel/multi-processing unit

Parameters:

max_poses (int) – max number of poses to store for each ligand
interaction_tolerance (float) – Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.
store_all_poses (bool) – Store all poses from docking results
add_interactions (bool) – find and save interactions between ligand poses and receptor
interaction_cutoffs (list(float)) – cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms
max_proc (int) – Maximum number of processes to create during parallel file parsing.
storageman (StorageManager) – storageman object
storageman_class (StorageManager) – storagemanager child class/database type
chunk_size (int) – how many tasks to send to a processor at a time
parser_manager (str, optional) – what parallelization or multiprocessing package to use
file_sources (InputFiles, optional) – given file sources including the receptor file
string_sources (InputStrings, optional) – given string sources including the path to the receptor

Raises:

ResultsProcessingError –

process_docking_data()

Processes docking data in the form of files or strings

Raises:: ResultsProcessingError – if no file or string sources are provided, or if both are provided
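
The rule stated above (exactly one of file sources or string sources must be given) amounts to an exclusive-or check. The sketch below shows one way to express it; `select_source` is a hypothetical helper and the exception class is a local stand-in, so this is not the body of process_docking_data.

```python
class ResultsProcessingError(Exception):
    """Local stand-in for ringtail.exceptions.ResultsProcessingError."""


def select_source(file_sources=None, string_sources=None):
    """Return which kind of source was given; raise if none or both were given."""
    if file_sources and string_sources:
        raise ResultsProcessingError("Provide file sources or string sources, not both.")
    if not file_sources and not string_sources:
        raise ResultsProcessingError("No file or string sources were provided.")
    return "files" if file_sources else "strings"


print(select_source(file_sources=["lig1.dlg"]))

try:
    select_source()
except ResultsProcessingError as error:
    print("rejected:", error)
```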

ringtail.ringtailcore module

class ringtail.ringtailcore.RingtailCore(db_file: str = 'output.db', storage_type: str = 'sqlite', docking_mode: str = 'dlg', logging_level: str = 'WARNING')

Bases: object

Core class for coordinating different actions on virtual screening including adding results to storage, filtering and clustering, and outputting data as rdkit molecules, plotting docking results, and visualizing select ligands in pymol.

db_file

name of database file being operated on

Type:: str

docking_mode

specifies what docking mode has been used for the results in the database

Type:: str

storageman

Interface module with database

Type:: StorageManager

resultsman

Module to deal with results processing before adding to database

Type:: ResultsManager

outputman

Manager for output tasks of log writing, plotting, ligand SDF writing, starting pymol sessions

Type:: OutputManager

filters

object holding all optional filters

Type:: Filters

_run_mode

refers to whether ringtail is run from the command line or through direct API use, where the former is more restrictive

Type:: str

add_results_from_files(file: str = None, file_path: str = None, file_list: str = None, file_pattern: str = None, recursive: bool = None, receptor_file: str = None, save_receptor: bool = None, filesources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)

Call storage manager to process result files and add them to the database. Creates a new database or adds to an existing one. Options can be provided as a dict or as individual options. If both are provided, individual options will overwrite those from the dictionary.

Parameters:

file (str or list(str), optional) – ligand result file
file_path (str or list(str), optional) – list of folders containing one or more result files
file_list (str or list(str), optional) – list of ligand result file(s)
file_pattern (str) – file pattern to use with recursive search in a file_path, “.dlg” for AutoDock-GPU and “.pdbqt” for vina
recursive (bool) – used to recursively search file_path for folders inside folders
receptor_file (str) – string containing the receptor .pdbqt
save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)
filesources_dict (dict) – file sources already as an object
duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
store_all_poses (bool) – store all ligand poses, does it take precedence over max poses?
max_poses (int) – how many poses to save (ordered by some score?)
add_interactions (bool) – add ligand-receptor interaction data, only in vina mode
interaction_tolerance (float) – longest ångström distance that is considered interaction?
interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction
max_proc (int) – max number of computer processors to use for file reading
options_dict (dict) – write options as a dict

Raises:

OptionError –

add_results_from_vina_string(results_strings: dict = None, receptor_file: str = None, save_receptor: bool = None, resultsources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)

Call storage manager to process the given vina output string and add it to the database. Options can be provided as a dict or as individual options. Creates a new database or adds to an existing one.

Parameters:

results_strings (dict) – string containing the ligand identified and docking results as a dictionary
receptor_file (str) – string containing the receptor .pdbqt
save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)
resultsources_dict (dict) – file sources already as an object
duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
store_all_poses (bool) – store all ligand poses, does it take precedence over max poses?
max_poses (int) – how many poses to save (ordered by some score?)
add_interactions (bool) – add ligand-receptor interaction data, only in vina mode
interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction
max_proc (int) – max number of computer processors to use for file reading
options_dict (dict) – write options as a dict

Raises:

OptionError –

static default_dict() → dict

Creates a dict of all Ringtail options.

Returns:: json string with options
Return type:: str

display_pymol(bookmark_name=None)

Launch pymol session and plot of LE vs docking score. Displays molecules when clicked.

Parameters:: bookmark_name (str) – bookmark name to use in pymol. Will look for the default bookmark ‘passing_results’ (or last used bookmark) if None is provided.

property docking_mode

Private method to retrieve docking mode

Returns:: docking mode
Return type:: str

drop_bookmark(bookmark_name: str)

Drops specified bookmark from the database

Parameters:: bookmark_name (str) – name of bookmark to be dropped.

export_bookmark_db(bookmark_name: str = None) → str

Export database containing data from bookmark

Parameters:: bookmark_name (str) – name for bookmark_db
Returns:: name of the new, exported database
Return type:: str

export_csv(requested_data: str, csv_name: str, table=False)

Get requested data from database, export as CSV

Parameters:

requested_data (str) – Table name or SQL-formatted query
csv_name (str) – Name for exported CSV file
table (bool) – flag indicating whether requested_data is a table name
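
Exporting either a whole table or the result of a SQL query to CSV can be sketched with the standard `sqlite3` and `csv` modules. This is an illustration under stated assumptions, not the implementation of export_csv: `dump_to_csv` is a hypothetical helper, and the `demo_results` table and its columns are made up and do not reflect the Ringtail database schema.

```python
import csv
import io
import sqlite3


def dump_to_csv(conn, requested_data, out, table=False):
    """Write a table, or the rows returned by a SQL query, to `out` as CSV."""
    # With table=True the name is interpolated into SQL, so pass trusted names only.
    query = f'SELECT * FROM "{requested_data}"' if table else requested_data
    cursor = conn.execute(query)
    writer = csv.writer(out)
    writer.writerow([column[0] for column in cursor.description])  # header row
    writer.writerows(cursor)                                       # data rows


# Made-up table used only for this demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_results (ligand TEXT, docking_score REAL)")
conn.executemany(
    "INSERT INTO demo_results VALUES (?, ?)", [("lig1", -9.1), ("lig2", -7.4)]
)

buffer = io.StringIO()
dump_to_csv(conn, "demo_results", buffer, table=True)
print(buffer.getvalue())
```

Passing `table=False` with a full query, for example `"SELECT ligand FROM demo_results WHERE docking_score < -8"`, goes through the same code path.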

export_receptors(): Export receptor in database to pdbqt

filter(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, filters_dict: dict = None, enumerate_interaction_combs: bool = False, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, log_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, bookmark_name: str = None, filter_bookmark: str = None, options_dict: dict = None, return_iter=False)

Prepare list of filters, then hand it off to storageman to perform filtering. Creates log of all ligand docking results that pass.

Parameters:

Filters:

eworst (float) – specify the worst energy value accepted
ebest (float) – specify the best energy value accepted
leworst (float) – specify the worst ligand efficiency value accepted
lebest (float) – specify the best ligand efficiency value accepted
score_percentile (float) – specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.
le_percentile (float) – specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.
vdw_interactions (list[tuple]) – define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_interactions (list[tuple]) – define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
reactive_interactions (list[tuple]) – check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_count (list[tuple]) – accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]
react_any (bool) – check if ligand reacted with any residue
max_miss (int) – Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.
ligand_name (list[str]) – specify ligand name(s). Will combine name filters with OR, e.g., [[“lig1”, “lig2”]]
ligand_substruct (list[str]) – SMARTS pattern(s) for substructure matching, e.g., [[“ccc”, “CN”]]
ligand_substruct_pos (list[list[type]]) – SMARTS, index of atom in SMARTS, cutoff dist, and target XYZ coords, e.g., [[“[Oh]C”, 0, 1.2, -5.5, 10.0, 15.5]] -> [[“smart_string”, index_of_positioned_atom, cutoff_distance, x, y, z]]
ligand_max_atoms (int) – Maximum number of heavy atoms a ligand may have
ligand_operator (str) – logical join operator for multiple SMARTS (default: OR), either AND or OR
filters_dict (dict) – provide filters as a dictionary

Ligand results options:

enumerate_interaction_combs (bool) – When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.
output_all_poses (bool) – By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.
mfpt_cluster (float) – Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.
interaction_cluster (float) – Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.
log_file (str) – by default, results are saved in output_log.txt; if this option is used, ligands and requested info passing the filters will be written to the specified file
overwrite (bool) – by default, if a log file exists, it is not overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database
order_results (str) – Stipulates how to order the results when written to the log file. By default, results are ordered by the order in which they were added to the database. Only takes one option. Available fields are: “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds)
outfields (str) – defines which fields are used when reporting the results (to stdout and to the log file); fields are specified as comma-separated values, e.g. --outfields=e,le,hb; by default, docking_score (energy) and ligand name are reported; the ligand is always reported in the first column. Available fields are: “Ligand_name” (Ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name)
bookmark_name (str) – name for resulting bookmark. Default value is ‘passing_results’
filter_bookmark (str) – name of bookmark to perform filtering over
options_dict (dict) – write options as a dict
return_iter (bool) – return an iterable of all of the filtering results

Returns:

number of ligands passing filter; optionally, an iterable of all of the filtering results

Return type:

int
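
The max_miss option is described as computing combinations of interaction filters that leave out up to max_miss interactions from the given set. One reading of that description can be sketched with `itertools.combinations`. `interaction_combinations` is a hypothetical helper, the third residue is made up, and including the full set (zero interactions missed) is an assumption; Ringtail's own enumeration may differ.

```python
from itertools import combinations


def interaction_combinations(interactions, max_miss):
    """All subsets obtained by dropping 0 to max_miss interactions from the set."""
    combos = []
    for n_missing in range(max_miss + 1):
        size = len(interactions) - n_missing
        if size <= 0:          # never emit an empty filter combination
            break
        combos.extend(combinations(interactions, size))
    return combos


interactions = ["A:VAL:279:", "A:LYS:162:", "A:ASP:86:"]
combos = interaction_combinations(interactions, max_miss=1)
for combo in combos:
    print(combo)
```

With three interactions and max_miss=1 this yields the full set plus the three two-interaction subsets. The count grows quickly with the number of interactions, which is consistent with the note that enumerating combinations can significantly increase runtime.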

finalize_write(): Finalize database write by creating interaction tables and setting database version

find_similar_ligands(query_ligname: str)

Find ligands in cluster with query_ligname

Parameters:: query_ligname (str) – name of the ligand in the ligand table to look for similars to
Returns:: number of ligands that are similar
Return type:: int

static generate_config_file_template()

Outputs to “config.json” in the current working directory if to_file is True; otherwise returns the dict of default option values used for the API (for the command line, a few more options are included that are always set explicitly when using the API)

Parameters:: to_file (bool) – whether to produce the template as a json string or as a file “config.json”
Returns:: file name of config file or json string with template including default values
Return type:: str

get_bookmark_names()

Method to retrieve all bookmark names in a database

Returns:: names of all bookmarks in a database
Return type:: list

static get_options_info() → dict: Gets names, default values, and meta data for all Ringtail options.

get_plot_data(bookmark_name: str = None)

Get ligand efficiency and energy for all docking data and for ligands that passed filtering in specified bookmark. Each tuple in the respective lists contains docking_score, leff, pose_id, and ligand name.

Parameters:: bookmark_name (str)
Returns:: [all_data], [filtered_data]
Return type:: list(tuple), list(tuple)

get_previous_filter_data(outfields=None, bookmark_name=None, log_file=None)

Get data requested in self.out_opts[‘outfields’] from the results bookmark of a previous filtering

Parameters:

outfields (str) – use outfields as described in RingtailOptions > StorageOptions
bookmark_name (str) – bookmark for which the filters were used

ligands_rdkit_mol(bookmark_name=None, write_nonpassing=False) → dict

Creates a dictionary of RDKit mols of all ligands specified from a bookmark, either excluding (default) or including those ligands that did not pass the filter(s).

Parameters:

bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering
write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands

Returns:

containing ligand names, RDKit mols, flexible residue mols, and other ligand properties

Return type:

all_mols (dict)

plot(save=True, bookmark_name: str = None, return_fig_handle: bool = False)

Get data needed for creating Ligand Efficiency vs Energy scatter plot from storageManager. Call OutputManager to create plot. Option to save the plot and close it immediately, or keep it open and save it manually later.

Parameters:

save (bool) – whether to save plot to the current directory. Will save and close figure
bookmark_name (str) – bookmark from which to fetch filtered data to plot
return_fig_handle (bool) – use to return a handle to the matplotlib figure instead of saving or showing figure

Returns:

will not show figure if returning figure handle

Return type:

matplotlib.pyplot.figure (optional)

produce_summary(columns=['docking_score', 'leff'], percentiles=[1, 10]) → None

Print summary of data in storage to stdout

Parameters:

columns (list(str)) – data columns used to prepare summary
percentiles (list(int)) – cutoff percentiles for the summary
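
What a percentile cutoff means for a column such as docking_score can be sketched as below. The nearest-rank method used here is a technique chosen for this example, not necessarily how produce_summary computes its values, and `percentile_cutoff` and the scores are hypothetical.

```python
import math


def percentile_cutoff(values, percentile):
    """Nearest-rank cutoff: smallest value such that `percentile` % of values are <= it."""
    ordered = sorted(values)   # most negative (best) scores first
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up docking scores; lower is better.
docking_scores = [-12.0, -11.0, -10.0, -9.0, -8.0, -7.0, -6.0, -5.0, -4.0, -3.0]
for p in (1, 10, 50):
    print(f"top {p} percent cutoff: {percentile_cutoff(docking_scores, p)}")
```

Because lower scores are better, the "top 1 percent" corresponds to the lowest end of the sorted values.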

save_receptor(receptor_file)

Add receptor to database.

Parameters:: receptor_file (str) – path to receptor file

set_filters(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, dict: dict = None)

Create a filter object containing all numerical and string filters.

Parameters:

eworst (float) – specify the worst energy value accepted
ebest (float) – specify the best energy value accepted
leworst (float) – specify the worst ligand efficiency value accepted
lebest (float) – specify the best ligand efficiency value accepted
score_percentile (float) – specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.
le_percentile (float) – specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.
vdw_interactions (list[tuple]) – define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_interactions (list[tuple]) – define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
reactive_interactions (list[tuple]) – check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_count (list[tuple]) – accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]
react_any (bool) – check if ligand reacted with any residue
max_miss (int) – Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.
ligand_name (list[str]) – specify ligand name(s). Will combine name filters with OR, e.g., [“lig1”, “lig2”]
ligand_substruct (list[str]) – SMARTS pattern(s) for substructure matching, e.g., [“ccc”, “CN”]
ligand_substruct_pos (list[str]) – SMARTS, index of atom in SMARTS, cutoff dist, and target XYZ coords, e.g., [‘”[Oh]C” 0 1.2 -5.5 10.0 15.5’] -> [“smart_string index_of_positioned_atom cutoff_distance x y z”]
ligand_max_atoms (int) – Maximum number of heavy atoms a ligand may have
ligand_operator (str) – logical join operator for multiple SMARTS (default: OR), either AND or OR
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
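
The interaction filters above take (residue string, wanted) tuples, while the residue pattern is written as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. The sketch below shows one way the two notations could be related. It assumes that a leading "-" marks an unwanted interaction; the helper to_interaction_tuple is hypothetical and not part of Ringtail.

```python
def to_interaction_tuple(spec: str) -> tuple:
    """Convert '[-]CHAIN:RES:NUM:ATOM_NAME' into ('CHAIN:RES:NUM:ATOM_NAME', wanted)."""
    wanted = not spec.startswith("-")  # assumption: '-' prefix means "not wanted"
    residue = spec if wanted else spec[1:]
    if residue.count(":") != 3:
        raise ValueError(f"expected CHAIN:RES:NUM:ATOM_NAME, got {spec!r}")
    return (residue, wanted)


# Empty fields (here the atom name) are kept as empty strings between colons.
vdw_interactions = [to_interaction_tuple(s) for s in ["A:VAL:279:", "-A:LYS:162:"]]
print(vdw_interactions)
```

The resulting list has the shape expected by vdw_interactions, hb_interactions, and reactive_interactions.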

set_output_options(log_file: str = None, export_sdf_path: str = None, enumerate_interaction_combs: bool = None, dict: dict = None)

Creates output options object that holds attributes related to reading and outputting results. Will assign log_file name and export_sdf_path to the output_manager object.

Parameters:

log_file (str) – by default, results are saved in “output_log.txt”; if this option is used, ligands and requested info passing the filters will be written to specified file
export_sdf_path (str) – specify the path where to save poses of ligands passing the filters (SDF format); if the directory does not exist, it will be created; if it already exists, it will throw an error, unless --overwrite is used. NOTE: the log file will be automatically saved in this path. Ligands will be stored as SDF files in the order specified.
enumerate_interaction_combs (bool) – When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
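
To make the max_miss behaviour more concrete, the sketch below enumerates the filter combinations obtained by leaving out up to max_miss interactions from a given set. It is an illustration with itertools only; the function name interaction_combinations is hypothetical, and keeping at least one filter in every combination is an assumption of this sketch.

```python
from itertools import combinations


def interaction_combinations(filters, max_miss):
    """Return all combinations that drop between 0 and max_miss filters."""
    if max_miss < 0:
        raise ValueError("max_miss must be >= 0")
    n = len(filters)
    combos = []
    for missing in range(max_miss + 1):
        size = n - missing
        if size < 1:  # assumption: never produce an empty combination
            break
        combos.extend(combinations(filters, size))
    return combos


filters = [("A:VAL:279:", True), ("A:LYS:162:", True), ("A:ASP:86:", True)]
combos = interaction_combinations(filters, max_miss=1)
print(len(combos))
```

With three filters and max_miss=1 this gives the full set plus the three pairs. This growth in the number of combinations is why enumerating each one separately can increase runtime.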

set_resultsman_attributes(store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, dict: dict = None)

Create results_manager_options object if needed, sets options, and assigns them to the results manager object.

Parameters:

store_all_poses (bool) – store all poses from the input files; overrides max_poses
max_poses (int) – store the top pose for the top n clusters
add_interactions (bool) – find interactions between ligand poses and receptor and save them to the database; requires receptor_file to be specified in vina mode
interaction_tolerance (float) – add the interactions for poses within this RMSD tolerance of the top pose in a cluster to that top pose
interaction_cutoffs (list) – distance cutoffs in ångström for hydrogen bond and van der Waals interactions, in that order
max_proc (int) – maximum number of processes to use for file reading
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
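
The setter methods accept both a dict and individual keyword arguments, with individual arguments taking precedence. A minimal sketch of that precedence rule follows. The helper merge_options is hypothetical, and treating None as "not provided" is an assumption based on the None defaults in the signatures.

```python
def merge_options(defaults, dict_args=None, **individual_args):
    """Layer values: defaults < dict_args < individual keyword arguments."""
    merged = dict(defaults)
    for source in (dict_args or {}, individual_args):
        for key, value in source.items():
            if key not in defaults:
                raise KeyError(f"unknown option: {key}")
            if value is not None:  # assumption: None means "not provided"
                merged[key] = value
    return merged


defaults = {"max_poses": 3, "store_all_poses": False}
merged = merge_options(
    defaults,
    {"max_poses": 5, "store_all_poses": True},
    max_poses=10,
)
print(merged)
```

Here max_poses ends up as 10 because the individual argument overrides the dict, while store_all_poses keeps the value from the dict.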

set_storageman_attributes(filter_bookmark: str = None, duplicate_handling: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, output_all_poses: str = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, dict: dict = None)

Create storage_manager_options object if needed, sets options, and assigns them to the storage manager object.

Parameters:

filter_bookmark (str) – Perform filtering over specified bookmark. (in output group in CLI)
duplicate_handling (str, optional) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
overwrite (bool) – by default, if a log file exists, it doesn’t get overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database
order_results (str) – Stipulates how to order the results when written to the log file. By default, results are ordered by the order in which they were added to the database. ONLY TAKES ONE OPTION. Available fields are: “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der Waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds).
outfields (str) – Defines which fields are used when reporting the results (to stdout and to the log file). Fields are specified as comma-separated values, e.g. “--outfields=e,le,hb”; by default, docking_score (energy) and ligand name are reported. Available fields are: “Ligand_name” (ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der Waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name). Fields are printed in the order in which they are provided. Ligand name will always be returned and will be added in first position if not specified.
output_all_poses (bool) – By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.
mfpt_cluster (float) – Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.
interaction_cluster (float) – Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.
bookmark_name (str) – name for resulting bookmark. Default value is “passing_results”
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
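
The three duplicate_handling behaviours (allow, “ignore”, “replace”) map naturally onto SQLite conflict clauses. The sketch below demonstrates this with the standard sqlite3 module on a simplified stand-in for the Results table. The helper store_rows, the reduced column set, and the choice of (LigName, pose_rank) as the uniqueness key are all assumptions for illustration and are not taken from the Ringtail schema.

```python
import sqlite3

INSERT_VERB = {
    None: "INSERT",                  # default: duplicates are allowed
    "ignore": "INSERT OR IGNORE",    # keep the first row, skip later duplicates
    "replace": "INSERT OR REPLACE",  # later duplicates replace earlier rows
}


def store_rows(rows, duplicate_handling=None):
    """Insert rows into an in-memory table and return what was stored."""
    if duplicate_handling not in INSERT_VERB:
        raise ValueError("duplicate_handling must be 'ignore', 'replace', or None")
    conn = sqlite3.connect(":memory:")
    # Assumption: a uniqueness constraint is only needed when duplicates are handled.
    unique = ", UNIQUE (LigName, pose_rank)" if duplicate_handling else ""
    conn.execute(
        "CREATE TABLE Results (LigName TEXT, pose_rank INTEGER, docking_score REAL"
        + unique + ")"
    )
    conn.executemany(
        INSERT_VERB[duplicate_handling] + " INTO Results VALUES (?, ?, ?)", rows
    )
    stored = conn.execute(
        "SELECT LigName, pose_rank, docking_score FROM Results ORDER BY rowid"
    ).fetchall()
    conn.close()
    return stored


rows = [("lig1", 1, -8.0), ("lig1", 1, -9.5)]
for mode in (None, "ignore", "replace"):
    print(mode, store_rows(rows, mode))
```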

update_database_version(consent=False, new_version='2.0.0'): Method to update database version from earlier versions to either 1.1.0 or 2.0.0

write_flexres_pdb(receptor_polymer, ligname: str, filename: str, bookmark_name: str = None)

Writes a receptor pdb with flexible residues based on the ligand provided

Parameters:

receptor_polymer (Polymer) – version of receptor produced by meeko
ligname (str) – ligand name for which the receptor flexible residue info should be collected
filename (str) – name of the output pdb, extension is optional, will default to ‘.pdb’
bookmark_name (str, optional) – will use last used bookmark if not specified, will not work in a db without any filtering performed

write_molecule_sdfs(sdf_path: str = None, all_in_one: bool = True, bookmark_name: str = None, write_nonpassing: bool = None)

Have output manager write molecule sdf files for passing results in given results bookmark

Parameters:

sdf_path (str, optional) – Optional path, existing or to be created in the current directory, where SDF files will be saved
all_in_one (bool, optional) – If True will write all molecules to one SDF (separated by $$$$), if False will write one molecule per SDF
bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering
write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands

Raises:

StorageError – if bookmark or data not found

ringtail.ringtailoptions module

class ringtail.ringtailoptions.Filters

Bases: RTOptions

Object that holds all optional filters.

checks(): Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

classmethod get_filter_keys(group) → list

Provide keys associated with each of the filter groups.

Parameters:: group (str) – includes property filters, interaction filters, ligand filters, or all filters

Returns:: list of filter keywords associated with the specified group(s)

options = {'ebest': {'default': None, 'description': 'Specify the best energy value accepted.', 'type': <class 'float'>}, 'eworst': {'default': None, 'description': 'Specify the worst energy value accepted.', 'type': <class 'float'>}, 'hb_count': {'default': None, 'description': "Accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [('hb_count', 5)].", 'type': <class 'list'>}, 'hb_interactions': {'default': [], 'description': "Define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'le_percentile': {'default': None, 'description': 'Specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'lebest': {'default': None, 'description': 'Specify the best ligand efficiency value accepted.', 'type': <class 'float'>}, 'leworst': {'default': None, 'description': 'Specify the worst ligand efficiency value accepted.', 'type': <class 'float'>}, 'ligand_max_atoms': {'default': None, 'description': 'Maximum number of heavy atoms a ligand may have.', 'type': <class 'int'>}, 'ligand_name': {'default': None, 'description': "Specify list of ligand name(s). Will combine name filters with 'OR'", 'type': <class 'list'>}, 'ligand_operator': {'default': None, 'description': "Logical join operator for multiple substruct filters. Will apply within 'ligand_substruct' filters and within 'ligand_substruct_pos' filters (the two groups are always joined by 'AND').", 'type': <class 'str'>}, 'ligand_substruct': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator. 
If error delimit each substructure with ''.", 'type': <class 'list'>}, 'ligand_substruct_pos': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. For API use list with six elements ['[Oh]C', 0, 1.2, -5.5, 10.0, 15.5] -> ['smart_string', index_of_positioned_atom, cutoff_distance, x, y, z]. For the CLI use as a string without comma separators, separating each filter with commas -> '[Oh]C 0 1.2 -5.5 10.0 15.5'. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator", 'type': <class 'list'>}, 'max_miss': {'default': 0, 'description': "Will compute all possible combinations of interaction filters excluding up to 'max_miss' number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with 'enumerate_interaction_combs' for enumeration of poses passing each individual combination of interaction filters.", 'type': <class 'int'>}, 'react_any': {'default': None, 'description': 'Check if ligand reacted with any residue.', 'type': <class 'bool'>}, 'reactive_interactions': {'default': [], 'description': "Check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'score_percentile': {'default': None, 'description': 'Specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'vdw_interactions': {'default': [], 'description': "Define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}}
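
The ligand_substruct_pos description distinguishes the six-element API list from the space-separated CLI string. The sketch below converts one form into the other. The helper parse_substruct_pos is hypothetical and assumes the SMARTS pattern itself contains no spaces.

```python
def parse_substruct_pos(text: str) -> list:
    """Turn 'SMARTS index cutoff x y z' into [str, int, float, float, float, float]."""
    parts = text.split()
    if len(parts) != 6:
        raise ValueError(f"expected 6 space-separated fields, got {len(parts)}")
    smarts, index, *numbers = parts
    return [smarts, int(index), *(float(n) for n in numbers)]


api_filter = parse_substruct_pos("[Oh]C 0 1.2 -5.5 10.0 15.5")
print(api_filter)
```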

class ringtail.ringtailoptions.GeneralOptions

Bases: RTOptions

Object that holds choices and default values for miscellaneous arguments used for the command line interface only.

checks(): Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'db_file': {'default': 'output.db', 'description': 'DB file for which to use for all Ringtail activities.', 'type': <class 'str'>}, 'debug': {'default': None, 'description': 'Print additional error information to STDOUT and to log.', 'type': <class 'bool'>}, 'docking_mode': {'default': 'dlg', 'description': "specify AutoDock program used to generate results. Available options are 'DLG' and 'vina'. Will automatically change --file_pattern to *.dlg* for DLG and *.pdbqt* for vina.", 'type': <class 'str'>}, 'print_summary': {'default': None, 'description': 'prints summary information about stored data to STDOUT.', 'type': <class 'bool'>}, 'verbose': {'default': None, 'description': 'Print results passing filtering criteria to STDOUT and to log. NOTE: runtime may be slower option used.', 'type': <class 'bool'>}}

class ringtail.ringtailoptions.InputFiles

Bases: RTOptions

Class that handles sources of data to be written including ligand data paths and how to traverse them, and options to store receptor.

checks(): Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'file': {'default': None, 'description': 'Ligand docking output file to save. Compressed (.gz) files allowed. Only results files associated the same receptor allowed.', 'type': <class 'list'>}, 'file_list': {'default': None, 'description': 'Text file(s) containing the list of docking output files to save; relative or absolute paths are allowed. Compressed (.gz) files allowed.', 'type': <class 'list'>}, 'file_path': {'default': None, 'description': 'Directory(s) containing docking output files to save. Compressed (.gz) files allowed', 'type': <class 'list'>}, 'file_pattern': {'default': None, 'description': "Specify which pattern to use when searching for result files to process (only with 'file_path').", 'type': <class 'str'>}, 'receptor_file': {'default': None, 'description': 'Use with Vina mode. Give file for receptor PDBQT.', 'type': <class 'str'>}, 'recursive': {'default': None, 'description': "Enable recursive directory scan when 'file_path' is used.", 'type': <class 'bool'>}, 'save_receptor': {'default': None, 'description': "Saves receptor PDBQT to database. Receptor location must be specied with in 'receptor_file'.", 'type': <class 'bool'>}, 'target': {'default': None, 'description': "Name of receptor. This field is autopopulated if 'receptor_file' is supplied.", 'type': <class 'str'>}}

class ringtail.ringtailoptions.InputStrings

Bases: RTOptions

Class that handles docking results strings from vina docking, with options to store receptor. Takes docking results string as a dictionary of: {ligand_name: docking_result}

checks(): Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'receptor_file': {'default': None, 'description': 'Use with Vina mode. Give file for receptor PDBQT.', 'type': <class 'str'>}, 'results_strings': {'default': None, 'description': 'A dictionary of ligand names and ligand docking output results. Currently only valid for vina docking', 'type': <class 'dict'>}, 'save_receptor': {'default': None, 'description': "Saves receptor PDBQT to database. Receptor location must be specied with in 'receptor_file'.", 'type': <class 'bool'>}, 'target': {'default': None, 'description': "Name of receptor. This field is autopopulated if 'receptor_file' is supplied.", 'type': <class 'str'>}}

class ringtail.ringtailoptions.OutputOptions

Bases: RTOptions

Class that holds options related to reading and output from the database, including format for result export and alternate ways of displaying the data (plotting).

checks(): Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'enumerate_interaction_combs': {'default': None, 'description': "When used with 'max_miss' > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.", 'type': <class 'bool'>}, 'export_sdf_path': {'default': '', 'description': "Specify the path where to save poses of ligands passing the filters (SDF format); if the directory does not exist, it will be created; if it already exist, it will throw an error, unless the 'overwrite' is used NOTE: the log file will be automatically saved in this path. Ligands will be stored as SDF files in the order specified.", 'type': <class 'str'>}, 'individual_sdf_files': {'default': False, 'description': 'Use if you like to print chosen molecules to individual SDF files, as opposed to one big SDF.', 'type': <class 'bool'>}, 'log_file': {'default': 'output_log.txt', 'description': "By default, read and filtering results are saved in 'output_log.txt'; if this option is used, ligands and requested info passing the filters will be written to specified file.", 'type': <class 'str'>}}

class ringtail.ringtailoptions.RTOptions

Bases: object

Holds standard methods for the ringtail option child classes. Options can be added using this format:

options = {
    "": {
        "default": '', "type": '', "description": ""
    },
}

initialize_from_dict(dict: dict, name)

Initializes a child object using the values available in its option dictionary.

Parameters:

dict (dict) – of attributes to be initialized to the object
name (str) – name of the childclass/object

classmethod is_valid_path(path)

Checks if path exists in current directory.

Parameters:: path (str)
Returns:: if path exists
Return type:: bool

todict(): Return class and its attributes as a dict of native types and not as objects (which they are if they are type checked using TypeSafe).

static valid_bookmark_name(name) → bool

Checks that bookmark name adheres to sqlite naming conventions of alphanumerical and limited symbols.

Parameters:: name (str) – bookmark name
Returns:: true if bookmark name is valid
Return type:: bool
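
A name check of this kind is often implemented with a regular expression. The sketch below is illustrative only: the allowed character set (letters, digits, underscore) is an assumption, since the exact set of "limited symbols" accepted by valid_bookmark_name is not listed here, and the function name is hypothetical.

```python
import re

# Assumption: only ASCII letters, digits, and underscores are accepted.
_BOOKMARK_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def looks_like_valid_bookmark_name(name: str) -> bool:
    """Return True if the whole name consists of allowed characters."""
    return bool(_BOOKMARK_PATTERN.fullmatch(name))


print(looks_like_valid_bookmark_name("passing_results"))
print(looks_like_valid_bookmark_name("bad name; DROP TABLE"))
```

Restricting bookmark names matters because they end up as identifiers in SQL statements, where they cannot be passed as bound parameters.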

class ringtail.ringtailoptions.ReadOptions

Bases: RTOptions

Object that holds choices and default values for read and export modes, mostly used for the command line interface.

checks(): Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'data_from_bookmark': {'default': None, 'description': 'Write log of --outfields data for bookmark specified by --bookmark_name. Must use without any filters.', 'type': <class 'bool'>}, 'export_bookmark_csv': {'default': None, 'description': 'Create csv of the bookmark given with bookmark_name. Output as <bookmark_name>.csv. Can also export full database tables.', 'type': <class 'str'>}, 'export_bookmark_db': {'default': None, 'description': 'Export a database containing only the results found in the bookmark specified by --bookmark_name. Will save as <input_db>_<bookmark_name>.db', 'type': <class 'bool'>}, 'export_query_csv': {'default': None, 'description': 'Create csv of the requested SQL query. Output as query.csv. MUST BE PRE-FORMATTED IN SQL SYNTAX e.g. SELECT [columns] FROM [table] WHERE [conditions]', 'type': <class 'str'>}, 'export_receptor': {'default': None, 'description': 'Export stored receptor pdbqt. Will write to current directory.', 'type': <class 'bool'>}, 'find_similar_ligands': {'default': None, 'description': 'Allows user to find similar ligands to given ligand name based on previously performed morgan fingerprint or interaction clustering.', 'type': <class 'str'>}, 'plot': {'default': None, 'description': 'Makes scatterplot of LE vs Best Energy, saves as scatter.png.', 'type': <class 'bool'>}, 'pymol': {'default': None, 'description': 'Lauch PyMOL session and plot of ligand efficiency vs docking score for molecules in bookmark specified with --bookmark_name. Will display molecule in PyMOL when clicked on plot. Will also open receptor if given.', 'type': <class 'bool'>}}

class ringtail.ringtailoptions.ResultsProcessingOptions

Bases: RTOptions

Class that holds database write options that affect write time, such as how to break up data files, number of computer processes to use, and how many poses to store.

checks(): Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'add_interactions': {'default': False, 'description': "Find interactions between ligand poses and receptor and save to database. Requires receptor PDBQT to be given with input files (all modes) and 'receptor_file' to be specified with Vina mode. SIGNIFICANTLY INCREASES DATBASE WRITE TIME.", 'type': <class 'bool'>}, 'interaction_cutoffs': {'default': [3.7, 4.0], 'description': "Use with 'add_interactions', specify distance cutoffs for measuring interactions between ligand and receptor in angstroms. Give as string, separating cutoffs for hydrogen bonds and VDW with comma (in that order). E.g. '-ic 3.7,4.0' will set the cutoff for hydrogen bonds to 3.7 angstroms and for VDW to 4.0. These are the default cutoffs.", 'type': <class 'list'>}, 'interaction_tolerance': {'default': None, 'description': 'Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose. Can use as flag with default tolerance of 0.8 for cmd line tool, or give other value as desired (cmd line and api). Only compatible with ADGPU mode.', 'type': <class 'float'>}, 'max_poses': {'default': 3, 'description': 'Store top pose for top n clusters.', 'type': <class 'int'>}, 'max_proc': {'default': None, 'description': 'Maximum number of processes to create during parallel file parsing. Defaults to number of CPU processors.', 'type': <class 'int'>}, 'store_all_poses': {'default': False, 'description': "Store all poses from input files. Overrides 'max_poses'.", 'type': <class 'bool'>}}

class ringtail.ringtailoptions.StorageOptions

Bases: RTOptions

Class that handles options for the storage (database) manager class, including conflict handling, and results clustering and ordering.

checks(): Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'bookmark_name': {'default': 'passing_results', 'description': "name for resulting book mark file. Default value is 'passing_results'", 'type': <class 'str'>}, 'duplicate_handling': {'default': None, 'description': "Specify how duplicate Results rows should be handled when inserting into database. Options are 'ignore' or 'replace'. Default behavior (no option provided) will allow duplicate entries.", 'type': <class 'str'>}, 'filter_bookmark': {'default': None, 'description': 'Perform filtering over specified bookmark.', 'type': <class 'str'>}, 'interaction_cluster': {'default': None, 'description': 'Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Useful for enhancing selection of ligands with diverse interactions.', 'type': <class 'float'>}, 'mfpt_cluster': {'default': None, 'description': 'Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Useful for selecting chemically dissimilar ligands.', 'type': <class 'float'>}, 'order_results': {'default': None, 'description': "Stipulates how to order the results when written to the log file. By default will be ordered by order results were added to the database. 
ONLY TAKES ONE OPTION.\n available fields are: \n 'e' (docking_score), \n 'le' (ligand efficiency), \n 'delta' (delta energy from best pose), \n 'ref_rmsd' (RMSD to reference pose), \n 'e_inter' (intermolecular energy), \n 'e_vdw' (van der waals energy), \n 'e_elec' (electrostatic energy), \n 'e_intra' (intermolecular energy), \n 'n_interact' (number of interactions), \n 'rank' (rank of ligand pose), \n 'run' (run number for ligand pose), \n 'hb' (hydrogen bonds); ", 'type': <class 'str'>}, 'outfields': {'default': 'Ligand_name,e', 'description': "Defines which fields are used when reporting the results (to stdout and to the log file). Fields are specified as comma-separated values, e.g. 'outfields=e,le,hb'; by default, docking_score (energy) and ligand name are reported. Ligand always reported in first column available fields are: \n\n 'Ligand_name' (Ligand name), \n 'e' (docking_score), \n 'le' (ligand efficiency), \n 'delta' (delta energy from best pose), \n 'ref_rmsd' (RMSD to reference pose), \n 'e_inter' (intermolecular energy), \n 'e_vdw' (van der waals energy), \n 'e_elec' (electrostatic energy), \n 'e_intra' (intermolecular energy), \n 'n_interact' (number of iteractions), \n 'ligand_smile' , \n 'rank' (rank of ligand pose), \n 'run' (run number for ligand pose), \n 'hb' (hydrogen bonds), \n 'receptor' (receptor name);", 'type': <class 'str'>}, 'output_all_poses': {'default': None, 'description': 'By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.', 'type': <class 'bool'>}, 'overwrite': {'default': None, 'description': "This option will allow overwriting of the database (in 'write'/add files-mode) and filtering log_file (in 'read'/filtering mode).", 'type': <class 'bool'>}}

order_options = {'delta', 'e', 'e_elec', 'e_inter', 'e_intra', 'e_vdw', 'hb', 'le', 'n_interact', 'rank', 'ref_rmsd', 'run'}

class ringtail.ringtailoptions.TypeSafe(default, type, object_name)

Bases: object

Class that handles safe typesetting of values of a specified built-in type. Any attribute can be set as a TypeSafe object, this ensures its type is checked whenever it is changed. This makes the attribute of type ‘object’ as opposed to its actual type. To return the value of an attribute as a native type value, you can create a ‘__getattribute__’ method in the class that holds the attribute (see e.g., RTOptions).

The hope is to extend this to work with custom types, such as “percentage” (float with a max and min value) and directory (string that must end with ‘/’).

Parameters:

object_name (str) – name of type safe instance
type (type) – any of the native types in python that the instance must adhere to
default (any) – default value of the object, can be any including None
value (any) – value of type type assigned to instance, can be same or different than default

Raises:

OptionError – if wrong type is attempted.
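
The idea of a value whose type is checked on every assignment can be sketched with a property setter. The class below is a simplified stand-in written for illustration and is not the Ringtail implementation. The names TypeChecked and the locally defined OptionError are hypothetical, and allowing None regardless of type is an assumption based on the None defaults in the option dictionaries.

```python
class OptionError(Exception):
    """Stand-in for the library's option error, defined here for the sketch."""


class TypeChecked:
    """Holds a value and re-checks its type each time it is assigned."""

    def __init__(self, default, type, object_name):
        self.object_name = object_name
        self.type = type
        self.default = default
        self.value = default  # goes through the setter below

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        # Assumption: None is always accepted, meaning "not set".
        if new_value is not None and not isinstance(new_value, self.type):
            raise OptionError(
                f"{self.object_name} must be of type {self.type.__name__}"
            )
        self._value = new_value


max_poses = TypeChecked(default=3, type=int, object_name="max_poses")
max_poses.value = 5
try:
    max_poses.value = "five"
except OptionError as error:
    print(error)
```

A rejected assignment leaves the previous value in place, because the error is raised before anything is stored.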

ringtail.storagemanager module

class ringtail.storagemanager.StorageManager

Bases: object

check_passing_bookmark_exists(bookmark_name: str = None)

Checks if bookmark name is in database

Parameters:: bookmark_name (str, optional) – name of bookmark to check for; if not given, the storageman bookmark_name attribute is used
Returns:: indicates if bookmark_name exists in the current database
Return type:: bool

check_storage_compatibility()

Checks if chosen storage type has been implemented

Parameters:: storage_type (str) – name of the storage type
Raises:: NotImplementedError – raised if selected storage type has not been implemented
Returns:: of implemented storage type
Return type:: class

close_storage(attached_db=None, vacuum=False)

Close connection to database

Parameters:

attached_db (str, optional) – name of attached DB (not including file extension)
vacuum (bool, optional) – indicates that database should be vacuumed before closing

crossref_filter(new_db: str, bookmark1_name: str, bookmark2_name: str, selection_type='-', old_db=None) → tuple

Selects ligands found or not found in the given bookmark in both current db and new_db. Stores as temp view

Parameters:

new_db (str) – file name for database to attach
bookmark1_name (str) – string for name of first bookmark/temp table to compare
bookmark2_name (str) – string for name of second bookmark to compare
selection_type (str) – “+” or “-” indicating if ligand names should (“+”) or should not (“-”) be in both databases
old_db (str, optional) – file name for previous database

Returns:

(name of new bookmark (str), number of ligands passing new bookmark (int))

Return type:

tuple
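
Cross-referencing two databases in SQLite typically relies on ATTACH DATABASE. The sketch below shows the general pattern with the standard sqlite3 module and two in-memory databases. Plain tables stand in for bookmarks, the alias new_db and the helper crossref_ligands are hypothetical, and "-" is interpreted here as "in the first bookmark but not in the second", which is an assumption.

```python
import sqlite3


def crossref_ligands(conn, bookmark1, bookmark2, selection_type="-"):
    """Return ligand names from bookmark1 that are (+) or are not (-) in bookmark2."""
    if selection_type not in ("+", "-"):
        raise ValueError("selection_type must be '+' or '-'")
    keyword = "IN" if selection_type == "+" else "NOT IN"
    # Identifiers cannot be bound as parameters, so names must be validated upstream.
    query = (
        f'SELECT LigName FROM main."{bookmark1}" '
        f'WHERE LigName {keyword} (SELECT LigName FROM new_db."{bookmark2}") '
        "ORDER BY LigName"
    )
    return [row[0] for row in conn.execute(query)]


conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS new_db")  # second, separate database
conn.execute("CREATE TABLE main.passing_results (LigName TEXT)")
conn.execute("CREATE TABLE new_db.passing_results (LigName TEXT)")
conn.executemany(
    "INSERT INTO main.passing_results VALUES (?)", [("lig1",), ("lig2",), ("lig3",)]
)
conn.executemany(
    "INSERT INTO new_db.passing_results VALUES (?)", [("lig2",), ("lig4",)]
)

in_both = crossref_ligands(conn, "passing_results", "passing_results", "+")
only_first = crossref_ligands(conn, "passing_results", "passing_results", "-")
print(in_both, only_first)
conn.close()
```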

field_to_column_name = {'Ligand_name': 'LigName', 'delta': 'deltas', 'e': 'docking_score', 'e_elec': 'energies_electro', 'e_inter': 'energies_inter', 'e_intra': 'energies_intra', 'e_vdw': 'energies_vdw', 'hb': 'num_hb', 'interactions': 'interactions', 'le': 'leff', 'ligand_smile': 'ligand_smile', 'n_interact': 'nr_interactions', 'rank': 'pose_rank', 'receptor': 'receptor', 'ref_rmsd': 'reference_rmsd', 'run': 'run_number'}
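
The mapping above can be used to translate a comma-separated outfields string into database column names. The sketch below copies the mapping and applies the documented rule that the ligand name is always reported in the first column. The helper outfields_to_columns is hypothetical and is not a Ringtail function.

```python
FIELD_TO_COLUMN = {
    "Ligand_name": "LigName", "delta": "deltas", "e": "docking_score",
    "e_elec": "energies_electro", "e_inter": "energies_inter",
    "e_intra": "energies_intra", "e_vdw": "energies_vdw", "hb": "num_hb",
    "interactions": "interactions", "le": "leff", "ligand_smile": "ligand_smile",
    "n_interact": "nr_interactions", "rank": "pose_rank", "receptor": "receptor",
    "ref_rmsd": "reference_rmsd", "run": "run_number",
}


def outfields_to_columns(outfields: str) -> list:
    """Map 'e,le,hb' style field names to column names, ligand name first."""
    fields = [f.strip() for f in outfields.split(",") if f.strip()]
    unknown = [f for f in fields if f not in FIELD_TO_COLUMN]
    if unknown:
        raise ValueError(f"unknown outfields: {unknown}")
    # Ligand name is always reported, and always in the first position.
    fields = ["Ligand_name"] + [f for f in fields if f != "Ligand_name"]
    return [FIELD_TO_COLUMN[f] for f in fields]


columns = outfields_to_columns("e,le,hb")
print(columns)
```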

filter_results(all_filters: dict, suppress_output=False) → iter

Generate and execute database queries from given filters.

Parameters:

all_filters (dict) – dict containing all filters. Expects format and keys corresponding to ringtail.Filters().todict()
suppress_output (bool) – suppresses printing of the filtering summary to stdout

Returns:

iterable, such as an sqlite cursor, of passing results

Return type:

iter

finalize_database_write(): Finalizes a database after it has been written to and saves the current database schema to the SQLite database.

get_plot_data(bookmark_name: str = None, only_passing=False)

This function is expected to return an ascii plot representation of the results

Parameters:

bookmark_name (str) – name of bookmark for which to fetch passing data. Will use default bookmark name if None. Returns empty list if bookmark does not exist.
only_passing (bool) – Only return data for passing ligands. Will return empty list for all data.

Returns:

cursors as (<all data cursor>, <passing data cursor>)

Return type:

tuple

insert_data(results_array, ligands_array, interaction_list, receptor_array=[], insert_receptor=False)

Inserts data from all arrays returned from results manager.

Parameters:

results_array (list) – list of data to be stored in Results table
ligands_array (list) – list of data to be stored in Ligands table
interaction_list (list) – list of data to be stored in interaction tables
receptor_array (list) – list of data to be stored in Receptors table
insert_receptor (bool, optional) – flag indicating that receptor info should be inserted

insert_interactions(Pose_IDs: list, interactions_list, duplicates)

Takes list of interactions, inserts into database

Parameters:

Pose_IDs (list(int)) – list of pose ids assigned while writing the current results to database
interactions_list (list) – List of tuples for interactions in form (“type”, “chain”, “residue”, “resid”, “recname”, “recid”)
duplicates (list(Pose_ID)) – any duplicates identified in “insert_results”, if duplicate handling has been specified

prune(): Deletes rows from results, ligands, and interactions in a bookmark if they do not pass filtering criteria

class ringtail.storagemanager.StorageManagerSQLite(db_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, filter_bookmark: str = None, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, duplicate_handling: str = None)

Bases: StorageManager

SQLite-specific StorageManager subclass

conn

Connection to database

Type:: SQLite.conn

open_cursors

list of cursors that were not closed by the function that created them. Will be closed by close_connection method.

Type:: list

db_file

database name

Type:: str

overwrite

switch to overwrite database if it exists

Type:: bool

order_results

what column name will be used to order results once read

Type:: str

outfields

data fields/columns to include when reading and outputting data

Type:: str

filter_bookmark

name of bookmark that filtering will be performed over

Type:: str

output_all_poses

whether or not to output all poses of a ligand

Type:: bool

mfpt_cluster

distance in ångströms to cluster ligands based on Morgan fingerprints

Type:: float

interaction_cluster

distance in ångströms to cluster ligands based on interactions

Type:: float

bookmark_name

name of current bookmark being written to or read from

Type:: str

duplicate_handling

optional attribute to deal with insertion of ligands already in the database

Type:: str

current_bookmark_name

name of last view to have been written to in the database

Type:: str

filtering_window

name of bookmark/view being filtered on

Type:: str

index_columns

Type:: list

view_suffix

current suffix for views

Type:: int

temptable_suffix

current suffix for temporary tables

Type:: int

field_to_column_name

Dictionary for converting ringtail options into DB column names

Type:: dict

bookmark_has_rows(bookmark_name: str) → bool

Method that checks if a given bookmark has any data in it

Parameters:: bookmark_name (str) – view to check
Returns:: True if more than zero rows in bookmark
Return type:: bool
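A row-existence check of this kind can be sketched with the standard sqlite3 module. This is not Ringtail's code: the helper `view_has_rows`, the simplified `Results` table, and the two view names are assumptions made for illustration.

```python
import sqlite3


def view_has_rows(conn: sqlite3.Connection, view_name: str) -> bool:
    """Return True if the named view (or table) yields at least one row."""
    # Identifiers cannot be bound as SQL parameters, so quote the name.
    quoted = '"' + view_name.replace('"', '""') + '"'
    row = conn.execute(f"SELECT EXISTS (SELECT 1 FROM {quoted})").fetchone()
    return bool(row[0])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (Pose_ID INTEGER PRIMARY KEY, docking_score REAL)")
conn.executemany("INSERT INTO Results (docking_score) VALUES (?)", [(-9.5,), (-6.0,)])
# Two illustrative views: one that matches a row and one that matches none.
conn.execute("CREATE VIEW good_scores AS SELECT * FROM Results WHERE docking_score < -8")
conn.execute("CREATE VIEW no_scores AS SELECT * FROM Results WHERE docking_score < -20")

print(view_has_rows(conn, "good_scores"), view_has_rows(conn, "no_scores"))
```

Using `SELECT EXISTS` lets SQLite stop at the first matching row instead of counting the whole view.
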

check_ringtaildb_version()

Checks the database version and confirms whether the code base is compatible with it

Returns:: (whether or not db is compatible with the code base (bool), current database version (str))
Return type:: tuple

check_storage_ready(run_mode: str, docking_mode: str, store_all_poses: bool, max_poses: int)

Check that storage is ready before proceeding, and creates new tables if needed

Parameters:

run_mode (str) – whether ringtail is run using the command line interface or the API
docking_mode (str) – what docking engine was used to produce results
store_all_poses (bool) – overwrites max poses
max_poses (int) – max poses to save to db

Raises:

StorageError –
OptionError – if database options are not compatible

clone(backup_name=None)

Creates a copy of the db

Parameters:: backup_name (str, optional) – name of the cloned database

count_receptors_in_db()

returns number of rows in Receptors table where receptor_object already has blob

Returns:: (number of rows in receptors table (int), name of receptor if present in table (str))
Return type:: tuple
Raises:: DatabaseQueryError –

create_bookmark(name, query, temp=False, add_poseID=False, filters={})

Takes name and selection query and creates a bookmark of name. Bookmarks are Ringtail-specific views whose information is stored in the ‘Bookmark’ table. #FIXME bug where ligand filter only results are not added as bookmarks

Parameters:

name (str) – Name for bookmark which will be created
query (str) – SQLite-formatted query used to create bookmark
temp (bool, optional) – Flag if bookmark should be temporary
add_poseID (bool, optional) – Add Pose_ID column to bookmark
filters (dict, optional) – a dict of filters used to construct the query

create_bookmark_from_temp_table(temp_table_name, bookmark_name, original_bookmark_name, wanted_list, unwanted_list=[])

Resaves temp bookmark stored in self.current_bookmark_name as new permanent bookmark

Parameters:

bookmark_name (str) – name of bookmark to save last temp bookmark as
original_bookmark_name (str) – name of original bookmark
wanted_list (list) – List of wanted database names
unwanted_list (list, optional) – List of unwanted database names
temp_table_name (str) – name of temporary table

create_temp_table_from_bookmark(): Method that creates a temporary table named “passing_temp”. Please note that this table will be dropped as soon as the database connection closes.

drop_bookmark(bookmark_name: str)

Drops specified bookmark from database

Parameters:: bookmark_name (str) – bookmark to be dropped
Raises:: DatabaseInsertionError –

fetch_bookmark(bookmark_name: str) → Cursor

returns SQLite cursor of all fields in bookmark

Parameters:: bookmark_name (str) – name of bookmark to retrieve
Returns:: cursor of requested view
Return type:: sqlite3.Cursor

fetch_clustered_similars(ligname: str)

Given ligname, returns poseids for similar poses/ligands from previous clustering. User prompted at runtime to choose cluster.

Parameters:

ligname (str) – ligname for ligand to find similarity with

Raises:

ValueError – wrong terminal input
DatabaseQueryError –

fetch_data_for_passing_results() → iter

Will return SQLite cursor with requested data for outfields for poses that passed filter in self.bookmark_name

Returns:: sqlite cursor of data from passing data
Return type:: iter
Raises:: OptionError –

fetch_filters_from_bookmark(bookmark_name: str = None)

Method that will retrieve filter values used to construct bookmark

Parameters:: bookmark_name (str, optional) – can get filter values for given bookmark, or filter values from currently active bookmark in storageman
Returns:: containing the filter data
Return type:: dict

fetch_flexres_info()

fetch flexible residues names and atomname lists

Returns:: (flexible_residues, flexres_atomnames)
Return type:: tuple

fetch_interaction_info_by_index(interaction_idx) → tuple

Returns tuple containing interaction info for given interaction_idx

Parameters:: interaction_idx (int) – interaction index to fetch info for
Returns:: tuple of info for requested interaction
Return type:: tuple

fetch_nonpassing_pose_properties(ligname)

fetch coordinates for poses of ligname which did not pass the filter

Parameters:

ligname (str) – name of ligand to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_passing_ligand_output_info() → iter

fetch information required by vsmanager for writing out molecules

Returns:

contains LigName, ligand_smile, atom_index_map, hydrogen_parents

Return type:

iter

fetch_passing_pose_properties(ligname)

fetch coordinates for poses passing filter for given ligand

Parameters:

ligname (str) – name of ligand to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_pose_interactions(Pose_ID) → iter

Fetch all interactions parameters belonging to a Pose_ID

Parameters:: Pose_ID (int) – pose id, 1-1 with Results table
Returns:: of interaction information for given Pose_ID
Return type:: iter

fetch_receptor_object_by_name(rec_name)

Returns Receptor object from database for given rec_name

Parameters:: rec_name (str) – Name of receptor to return object for

Returns:: receptor object as a string
Return type:: str

fetch_receptor_objects()

Returns all Receptor objects from database

Returns:: of receptor names and objects
Return type:: iter (tuple)

fetch_single_ligand_output_info(ligname) → str

get output information for given ligand

Parameters:: ligname (str) – ligand name
Raises:: DatabaseQueryError –
Returns:: information containing smiles, atom and index mapping, and hydrogen parents
Return type:: str

fetch_single_pose_properties(pose_ID: int) → iter

fetch coordinates for pose given by pose_ID

Parameters:

pose_ID (int) – ID of pose to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_summary_data(columns=['docking_score', 'leff'], percentiles=[1, 10]) → dict

Collect summary data for database: number of ligands, number of stored poses, number of unique interactions, and the min, max, and percentiles for each column in columns.

Parameters:

columns (list (str)) – columns to be displayed and used in summary
percentiles (list(int)) – percentiles to consider

Returns:

of data summary

Return type:

dict
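The per-column part of such a summary can be sketched in pure Python. The function `summarize`, the layout of the returned dictionary, and the nearest-rank percentile definition are all assumptions for illustration; Ringtail's own computation and output keys may differ.

```python
def summarize(rows, columns, percentiles=(1, 10)):
    """Compute min, max, and nearest-rank percentiles for each requested column.

    `rows` is a list of dicts; hypothetical helper, not a Ringtail function.
    """
    summary = {}
    for column in columns:
        values = sorted(row[column] for row in rows)
        stats = {"min": values[0], "max": values[-1]}
        for p in percentiles:
            # Nearest-rank percentile: rank = ceil(p * n / 100), in integer math.
            rank = max(1, -(-p * len(values) // 100))
            stats[f"percentile_{p}"] = values[rank - 1]
        summary[column] = stats
    return summary


# Twenty illustrative docking scores from -20.0 to -1.0.
rows = [{"docking_score": float(-20 + i)} for i in range(20)]
print(summarize(rows, ["docking_score"]))
```
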

classmethod format_for_storage(ligand_dict: dict) → tuple

Takes the file dictionary from the file parser and formats it into the required storage format

Parameters:

ligand_dict (dict) – Dictionary containing data from the fileparser

Returns:

of lists ([result_row_1, result_row_2, …], ligand_row, [interaction_tuple_1, interaction_tuple_2, …])

Return type:

tuple

get_all_bookmark_names()

Get all bookmarks in sql database as a list of names. Bookmarks are a specific type of sqlite-views whose information is stored in the Bookmarks table.

Returns:: of bookmark names
Return type:: list

get_current_bookmark_name()

returns current bookmark name

Returns:: name of last passing results bookmark used by database
Return type:: str

get_maxmiss_union(total_combinations: int)

Get results that are in union considering max miss

Parameters:: total_combinations (int) – number of possible combinations
Returns:: of passing results
Return type:: iter
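How the number of possible combinations grows with max_miss can be sketched with itertools. The helper `interaction_filter_combinations` is hypothetical, and the choice to never return an empty combination is an assumption; the example filter strings follow the format used elsewhere in this reference.

```python
from itertools import combinations


def interaction_filter_combinations(filters, max_miss):
    """List every subset of `filters` that drops at most `max_miss` of them.

    Hypothetical helper; at least one filter is always kept (an assumption).
    """
    n = len(filters)
    smallest = max(n - max_miss, 1)
    combos = []
    for size in range(n, smallest - 1, -1):
        combos.extend(combinations(filters, size))
    return combos


filters = ["A:VAL:279:", "A:LYS:162:", "A:ASP:86:"]
combos = interaction_filter_combinations(filters, max_miss=1)
print(len(combos))
for combo in combos:
    print(combo)
```

With three filters and max_miss=1 this gives the full set plus each of the three ways of leaving one filter out.
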

insert_receptor_blob(receptor, rec_name)

Takes object of Receptor class, updates the column in Receptor table

Parameters:

receptor (bytes) – bytes receptor object to be inserted into DB
rec_name (string) – Name of receptor. Used to insert into correct row of DB

Raises:

DatabaseInsertionError –

overwrite_storage(): Will drop all tables in the database.

set_bookmark_suffix(suffix)

Sets internal bookmark_suffix variable

Parameters:: suffix (str) – suffix to attach to bookmark-related queries or creation

to_dataframe(requested_data: str, table=True) → pandas.DataFrame

Returns a pandas DataFrame of table or query given as requested_data

Parameters:

requested_data (str) – String containing SQL-formatted query or table name
table (bool) – Flag indicating if requested_data is table name or not

Returns:

dataframe of requested data

Return type:

pd.DataFrame

update_database_version(new_version, consent=False)

method that updates sqlite database schema 1.0.0 or 1.1.0 to 1.1.0 or 2.0.0

#NOTE: If you created a version 1 database with the duplicate handling option, there is a chance of inconsistent behavior of anything involving interactions as the Pose_ID was not used as an explicit foreign key in db v1.0.0 and v1.1.0.

Parameters:: consent (bool, optional) – variable to ensure consent to update database is explicit
Returns:: bool

Module contents

class ringtail.CLOptionParser

Bases: object

Command line option/argument parser. Options and switches are utilized in the script ‘rt_process_vs.py’.

process_mode

operating in ‘write’ or ‘read’ mode

Type:: str

rtcore

ringtail core object initialized with the provided db_file

Type:: RingtailCore

filters

fully parsed and organized optional filters

Type:: dict

file_sources

fully parsed docking results and receptor files

Type:: dict

writeopts

fully parsed arguments related to database writing

Type:: dict

storageopts

fully parsed arguments related to how the storage system behaves

Type:: dict

outputopts

fully parsed arguments related to output and reading from the database

Type:: dict

print_summary

switch to print database summary

Type:: bool

filtering

switch to run filtering method

Type:: bool

plot

switch to plot the data

Type:: bool

export_bookmark_db

switch to export bookmark as a new database

Type:: bool

export_receptor

switch to export receptor information to pdbqt

Type:: bool

pymol

switch to visualize ligands in pymol

Type:: bool

data_from_bookmark

switch to write bookmark data to the output log file

Type:: bool

Raises:: OptionError – Error when an option cannot be parsed correctly

process_options(parsed_opts)

Process and organize command line options into ringtail options and filter dictionaries and ringtail core attributes

Parameters:: parsed_opts (argparse.Namespace) – arguments provided through the cmdline_parser method.

exception ringtail.DatabaseConnectionError: Bases: StorageError

exception ringtail.DatabaseInsertionError: Bases: StorageError

exception ringtail.DatabaseQueryError: Bases: StorageError

exception ringtail.DatabaseTableCreationError: Bases: StorageError

exception ringtail.DatabaseViewCreationError: Bases: StorageError

class ringtail.DockingFileReader(*args: Any, **kwargs: Any)

Bases: Process

This class is the individual worker for processing docking results. One instance of this class is instantiated for each available processor.

queueIn

current queue for the processor/file reader

Type:: multiprocess.Queue

queueOut

queue for the processor/file reader after adding or removing an item

Type:: multiprocess.Queue

pipe_conn

pipe connection to the reader

Type:: multiprocess.Pipe

storageman

storageman object

Type:: StorageManager

storageman_class

storagemanager child class/database type

Type:: StorageManager

docking_mode

describes what docking engine was used to produce the results

Type:: str

max_poses

max number of poses to store for each ligand

Type:: int

interaction_tolerance

Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.

Type:: float

store_all_poses

Store all poses from docking results

Type:: bool

add_interactions

find and save interactions between ligand poses and receptor

Type:: bool

interaction_cutoffs

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:: list(float)

target

receptor name

Type:: str

run()

Method overload from the parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method, which is called by the initialization (see below) with start()

Raises:

NotImplementedError – if parser for specific docking result type is not implemented
FileParsingError –

exception ringtail.FileParsingError: Bases: Exception

class ringtail.Filters

Bases: RTOptions

Object that holds all optional filters.

checks(): Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

classmethod get_filter_keys(group) → list

Provide keys associated with each of the filter groups.

Parameters:: group (str) – includes property filters, interaction filters, ligand filters, or all filters
Returns:: list of filter keywords associated with the specified group(s)

options = {'ebest': {'default': None, 'description': 'Specify the best energy value accepted.', 'type': <class 'float'>}, 'eworst': {'default': None, 'description': 'Specify the worst energy value accepted.', 'type': <class 'float'>}, 'hb_count': {'default': None, 'description': "Accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [('hb_count', 5)].", 'type': <class 'list'>}, 'hb_interactions': {'default': [], 'description': "Define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'le_percentile': {'default': None, 'description': 'Specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'lebest': {'default': None, 'description': 'Specify the best ligand efficiency value accepted.', 'type': <class 'float'>}, 'leworst': {'default': None, 'description': 'Specify the worst ligand efficiency value accepted.', 'type': <class 'float'>}, 'ligand_max_atoms': {'default': None, 'description': 'Maximum number of heavy atoms a ligand may have.', 'type': <class 'int'>}, 'ligand_name': {'default': None, 'description': "Specify list of ligand name(s). Will combine name filters with 'OR'", 'type': <class 'list'>}, 'ligand_operator': {'default': None, 'description': "Logical join operator for multiple substruct filters. Will apply within 'ligand_substruct' filters and within 'ligand_substruct_pos' filters (the two groups are always joined by 'AND').", 'type': <class 'str'>}, 'ligand_substruct': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator. 
If error delimit each substructure with ''.", 'type': <class 'list'>}, 'ligand_substruct_pos': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. For API use list with six elements ['[Oh]C', 0, 1.2, -5.5, 10.0, 15.5] -> ['smart_string', index_of_positioned_atom, cutoff_distance, x, y, z]. For the CLI use as a string without comma separators, separating each filter with commas -> '[Oh]C 0 1.2 -5.5 10.0 15.5'. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator", 'type': <class 'list'>}, 'max_miss': {'default': 0, 'description': "Will compute all possible combinations of interaction filters excluding up to 'max_miss' number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with 'enumerate_interaction_combs' for enumeration of poses passing each individual combination of interaction filters.", 'type': <class 'int'>}, 'react_any': {'default': None, 'description': 'Check if ligand reacted with any residue.', 'type': <class 'bool'>}, 'reactive_interactions': {'default': [], 'description': "Check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'score_percentile': {'default': None, 'description': 'Specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'vdw_interactions': {'default': [], 'description': "Define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}}
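The interaction string format used by hb_interactions, vdw_interactions, and reactive_interactions can be unpacked as in the sketch below. The function `parse_interaction` is hypothetical and not part of Ringtail, and treating a leading “-” as marking an unwanted interaction is an assumption based on the [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME] pattern above.

```python
def parse_interaction(spec: str, wanted: bool = True) -> dict:
    """Split '[-]CHAIN:RES:NUM:ATOM_NAME' into named fields.

    Hypothetical helper; empty fields are kept as empty strings.
    """
    if spec.startswith("-"):
        # Assumption: a leading "-" marks the interaction as unwanted.
        wanted = False
        spec = spec[1:]
    parts = spec.split(":")
    if len(parts) > 4:
        raise ValueError(f"too many ':'-separated fields in {spec!r}")
    parts += [""] * (4 - len(parts))
    chain, resname, resid, atomname = parts
    return {
        "chain": chain,
        "resname": resname,
        "resid": resid,
        "atomname": atomname,
        "wanted": wanted,
    }


print(parse_interaction("A:VAL:279:"))
print(parse_interaction("-A:LYS:162:NZ"))
```
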

class ringtail.InteractionFinder(rec_string, interaction_cutoff_radii)

Bases: object

Class for handling and calculating ligand-receptor interactions.

rec_string

string describing the receptor

Type:: str

interaction_cutoff_radii

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:: list(float)

find_pose_interactions(lig_atomtype_list: list, lig_coordinates: list) → dict

Method that identifies interactions for a pose within the given cutoff distances in the main class.

Parameters:

lig_atomtype_list (list) – list of atoms in the ligand
lig_coordinates (list) – coordinates for the atoms in the ligand

Returns:

all interaction details for a given ligand pose

Return type:

dict

class ringtail.MPManager(docking_mode, max_poses, interaction_tolerance, store_all_poses, add_interactions, interaction_cutoffs, max_proc, storageman, storageman_class, chunk_size, target, receptor_file, file_pattern=None, file_sources=None, string_sources=None)

Bases: object

Manager that orchestrates parallel processing of docking results data, using one of the supported multiprocessors.

docking_mode

describes what docking engine was used to produce the results

Type:: str

max_poses

max number of poses to store for each ligand

Type:: int

interaction_tolerance

Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.

Type:: float

store_all_poses

Store all poses from docking results

Type:: bool

add_interactions

find and save interactions between ligand poses and receptor

Type:: bool

interaction_cutoffs

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:: list(float)

max_proc

Maximum number of processes to create during parallel file parsing.

Type:: int

storageman

storageman object

Type:: StorageManager

storageman_class

storagemanager child class/database type

Type:: StorageManager

chunk_size

how many tasks to send to a processor at a time

Type:: int

target

name of receptor

Type:: str

receptor_file

file path to receptor

Type:: str

file_pattern

file pattern to look for if recursively finding results files to process

Type:: str, optional

file_sources

RingtailOption object that holds all attributes related to results files

Type:: InputFiles, optional

string_sources

RingtailOption object that holds all attributes related to results strings

Type:: InputStrings, optional

num_files

number of files processed at any given time

Type:: int

process_results(): Processes results data (files or string sources) by adding them to the queue and starting their processing in multiprocess.

exception ringtail.MultiprocessingError: Bases: Exception

exception ringtail.OptionError: Bases: Exception

exception ringtail.OutputError: Bases: Exception

class ringtail.OutputManager(log_file=None, export_sdf_path=None)

Bases: object

Class for creating outputs, can be a context manager to handle log files

log_file

name for log file

Type:: str

export_sdf_path

path for exporting SDF molecule files

Type:: str

_log_open

if log file is open or not

Type:: bool

close_logfile(): Closes the log file properly and reset file pointer to filename

log_num_passing_ligands(number_passing_ligands: int)

Write the number of ligands which pass given filter to log file

Parameters:: number_passing_ligands (int) – number of ligands that passed filter
Raises:: OutputError –

open_logfile(write_filters_header=True)

Opens log file and creates it if needed

Parameters:: write_filters_header (bool) – only used because one method does not take the same headers
Raises:: OutputError –

plot_all_data(xdata, ydata, num_of_bins: int = 100)

Takes dictionary of binned data where key is the coordinates of the bin and value is the number of points in that bin. Adds to scatter plot colored by value

Parameters:

xdata (list) – list of x axis data (needs to be same length as ydata)
ydata (list) – list of y axis data (needs to be same length as xdata)
num_of_bins (int) – number of bins to organize data in

Returns:

matplotlib.pyplot.figure

Raises:

OutputError –
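The binned-data dictionary described for plot_all_data, keyed by bin coordinates with point counts as values, can be sketched without any plotting library. The helper `bin_points` and its equal-width binning scheme are assumptions for illustration, not the method's actual implementation.

```python
from collections import Counter


def bin_points(xdata, ydata, num_of_bins=100):
    """Count points per (x_bin, y_bin) cell using equal-width bins.

    Hypothetical helper; returns {(x_bin, y_bin): number_of_points}.
    """
    if len(xdata) != len(ydata):
        raise ValueError("xdata and ydata must have the same length")
    if not xdata:
        return {}

    def to_index(value, low, high):
        if high == low:
            return 0  # all values identical: a single bin
        index = int((value - low) / (high - low) * num_of_bins)
        return min(index, num_of_bins - 1)  # keep the maximum in the last bin

    x_low, x_high = min(xdata), max(xdata)
    y_low, y_high = min(ydata), max(ydata)
    counts = Counter(
        (to_index(x, x_low, x_high), to_index(y, y_low, y_high))
        for x, y in zip(xdata, ydata)
    )
    return dict(counts)


print(bin_points([0, 1, 2, 3, 4], [0, 0, 4, 4, 4], num_of_bins=2))
```
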

plot_single_points(x: list, y: list, markersize: int = 20, color='crimson')

Add points to scatter plot with given x and y coordinates and color.

Parameters:

x (list) – x coordinates
y (list) – y coordinates
markersize (int, optional) – size of the markers. Default 20.
color (str, optional) – Color for points. Default crimson.

Raises:

OutputError –

save_scatterplot()

Saves and closes current figure as scatter.png

Raises:: OutputError –

scatter_hist(x, y, z, ax_histx, ax_histy)

Makes scatterplot with a histogram on each axis

Parameters:

x (list) – x coordinates for data
y (list) – y coordinates for data
z (list) – z coordinates for data
ax (matplotlib.axis) – scatterplot axis
ax_histx (matplotlib.axis) – x histogram axis
ax_histy (matplotlib.axis) – y histogram axis

Raises:

OutputError –

write_filter_log(lines)

Writes lines from results iterable into log file

Parameters:: lines (iterable) – Iterable with tuples of data for writing into log
Raises:: OutputError –
Returns:: number of ligands passing that are written to log file
Return type:: int

write_filters_to_log(filters_dict, included_interactions, additional_info='')

Takes dictionary of filters, formats as string and writes to log file

Parameters:

filters_dict (dict) – dictionary with filtering options
included_interactions (list) – types of interactions to include in the filtering
additional_info (str) – any additional information to write to top of log file

Raises:

OutputError –

write_find_similar_header(query_ligname, cluster_name): Properly formats header for the log file find_similar_ligands

write_maxmiss_union_header(): Properly formats header for the log file if using max_miss and enumerate_interaction_combs

write_out_mol(filename, mol, flexres_mols, properties)

Writes out given mol as sdf. Will create the specified sdf folder in current working directory if needed.

Parameters:

filename (str) – name of SDF file that will be written to
mol (RDKit.Chem.Mol) – RDKit molobject to be written to SDF
flexres_mols (list) – dictionary of rdkit molecules for flexible residues
properties (dict) – dictionary of list of properties to add to mol before writing

Raises:

OutputError –

write_receptor_pdbqt(recname: str, receptor_compbytes)

Writes a pdbqt file from receptor “blob”

Parameters:

recname (str) – name of receptor to use in output filename
receptor_compbytes (blob) – receptor blob

write_results_bookmark_to_log(bookmark_name)

Write the name of the result bookmark into log

Parameters:: bookmark_name (str) – name of current results’ bookmark in db
Raises:: OutputError –

exception ringtail.RTCoreError: Bases: Exception

class ringtail.ReceptorManager

Bases: object

Class with methods dealing with formatting of receptor information

static blob2str(receptor_blob)

Converts a blob of compressed receptor file info into a string

Parameters:: receptor_blob (blob) – zipped receptor blob
Returns:: receptor string
Return type:: str

static make_receptor_blobs(file_list)

Creates compressed receptor info

Parameters:: file_list (str) – path to receptor file
Returns:: compressed receptor
Return type:: blob
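The round trip between receptor text and a compressed blob can be sketched as follows. This assumes gzip compression and UTF-8 text, which may not match what ReceptorManager actually does; `make_blob` and `blob_to_str` are hypothetical stand-ins, and the PDBQT line is made-up sample data.

```python
import gzip


def make_blob(receptor_text: str) -> bytes:
    """Compress receptor file contents into a bytes blob (gzip is an assumption)."""
    return gzip.compress(receptor_text.encode("utf-8"))


def blob_to_str(receptor_blob: bytes) -> str:
    """Decompress a blob produced by make_blob back into a string."""
    return gzip.decompress(receptor_blob).decode("utf-8")


# Made-up single-atom record used only to exercise the round trip.
sample = "ATOM      1  N   VAL A 279      10.000  12.000  -3.000  1.00  0.00    -0.066 N\n"
blob = make_blob(sample)
print(type(blob).__name__, blob_to_str(blob) == sample)
```
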

class ringtail.ResultsManager(docking_mode: str = None, max_poses: int = None, interaction_tolerance: float = None, store_all_poses: bool = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, storageman: StorageManager = None, storageman_class: StorageManager = None, chunk_size: int = 1, parser_manager: str = 'multiprocess', file_sources=None, string_sources=None)

Bases: object

Class that handles the processing of the results, including passing on the docking results to the appropriate parallel/multi-processing unit

Parameters:

max_poses (int) – max number of poses to store for each ligand
interaction_tolerance (float) – Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.
store_all_poses (bool) – Store all poses from docking results
add_interactions (bool) – find and save interactions between ligand poses and receptor
interaction_cutoffs (list(float)) – cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms
max_proc (int) – Maximum number of processes to create during parallel file parsing.
storageman (StorageManager) – storageman object
storageman_class (StorageManager) – storagemanager child class/database type
chunk_size (int) – how many tasks to send to a processor at a time
parser_manager (str, optional) – what parallelization or multiprocessing package to use
file_sources (InputFiles, optional) – given file sources including the receptor file
string_sources (InputStrings, optional) – given string sources including the path to the receptor

Raises:

ResultsProcessingError –

process_docking_data()

Processes docking data in the form of files or strings

Raises:: ResultsProcessingError – if no file or string sources are provided, or if both are provided

exception ringtail.ResultsProcessingError: Bases: Exception

class ringtail.RingtailCore(db_file: str = 'output.db', storage_type: str = 'sqlite', docking_mode: str = 'dlg', logging_level: str = 'WARNING')

Bases: object

Core class for coordinating different actions on virtual screening including adding results to storage, filtering and clustering, and outputting data as rdkit molecules, plotting docking results, and visualizing select ligands in pymol.

db_file

name of database file being operated on

Type:: str

docking_mode

specifies what docking mode has been used for the results in the database

Type:: str

storageman

Interface module with database

Type:: StorageManager

resultsman

Module to deal with results processing before adding to database

Type:: ResultsManager

outputman

Manager for output tasks of log-writing, plotting, ligand SDF writing, starting pymol sessions

Type:: OutputManager

filters

object holding all optional filters

Type:: Filters

_run_mode

refers to whether ringtail is run from the command line or through direct API use, where the former is more restrictive

Type:: str

add_results_from_files(file: str = None, file_path: str = None, file_list: str = None, file_pattern: str = None, recursive: bool = None, receptor_file: str = None, save_receptor: bool = None, filesources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)

Call storage manager to process result files and add to database. Creates a database or adds to an existing one. Options can be provided as a dict or as individual options. If both are provided, individual options will overwrite those from the dictionary.

Parameters:

file (str or list(str), optional) – ligand result file
file_path (str or list(str), optional) – list of folders containing one or more result files
file_list (str or list(str), optional) – list of ligand result file(s)
file_pattern (str) – file pattern to use with recursive search in a file_path, “.dlg” for AutoDock-GPU and “.pdbqt” for vina
recursive (bool) – used to recursively search file_path for folders inside folders
receptor_file (str) – string containing the receptor .pdbqt
save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)
filesources_dict (dict) – file sources already as an object
duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
store_all_poses (bool) – store all ligand poses; overrides max_poses
max_poses (int) – how many poses to save (ordered by some score?)
add_interactions (bool) – add ligand-receptor interaction data, only in vina mode
interaction_tolerance (float) – longest distance, in ångströms, that is considered an interaction?
interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction
max_proc (int) – max number of computer processors to use for file reading
options_dict (dict) – write options as a dict

Raises:

OptionError –
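
The precedence rule described above (individual options overwrite those from the dictionary) can be illustrated with a minimal sketch. The helper `merge_options` is hypothetical and not part of Ringtail; it only assumes that an individual option left as None counts as "not provided".

```python
def merge_options(options_dict=None, **individual):
    """Hypothetical helper: merge dict options with individual options.

    Individual options that are not None take precedence over the dict.
    """
    merged = dict(options_dict or {})
    for name, value in individual.items():
        if value is not None:  # None is treated as "not provided"
            merged[name] = value
    return merged


opts = merge_options(
    options_dict={"max_poses": 3, "store_all_poses": False},
    max_poses=5,           # overrides the value from the dict
    store_all_poses=None,  # not provided, so the dict value is kept
)
print(opts)
```

Using None as the "not provided" marker matches the signature above, where every individual option defaults to None.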

add_results_from_vina_string(results_strings: dict = None, receptor_file: str = None, save_receptor: bool = None, resultsources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)

Call storage manager to process the given vina output string and add it to the database. Creates a new database or adds to an existing one. Options can be provided as a dict or as individual options.

Parameters:

results_strings (dict) – dictionary containing the ligand identifier and the docking results string
receptor_file (str) – string containing the receptor .pdbqt
save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)
resultsources_dict (dict) – file sources already as an object
duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
store_all_poses (bool) – store all ligand poses; overrides max_poses
max_poses (int) – how many poses to save (ordered by some score?)
add_interactions (bool) – add ligand-receptor interaction data, only in vina mode
interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction
max_proc (int) – max number of computer processors to use for file reading
options_dict (dict) – write options as a dict

Raises:

OptionError –

static default_dict() → dict

Creates a dict of all Ringtail options.

Returns:: json string with options
Return type:: str

display_pymol(bookmark_name=None)

Launch pymol session and plot of LE vs docking score. Displays molecules when clicked.

Parameters:: bookmark_name (str) – bookmark name to use in pymol. Will look for the default bookmark ‘passing_results’ (or last used bookmark) if None is provided.

property docking_mode

Private method to retrieve docking mode

Returns:: docking mode
Return type:: str

drop_bookmark(bookmark_name: str)

Drops specified bookmark from the database

Parameters:: bookmark_name (str) – name of bookmark to be dropped.

export_bookmark_db(bookmark_name: str = None) → str

Export database containing data from bookmark

Parameters:: bookmark_name (str) – name for bookmark_db
Returns:: name of the new, exported database
Return type:: str

export_csv(requested_data: str, csv_name: str, table=False)

Get requested data from database, export as CSV

Parameters:

requested_data (str) – Table name or SQL-formatted query
csv_name (str) – Name for exported CSV file
table (bool) – flag indicating if requested data is a table name
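
As an illustration of the two modes (table name versus SQL query), here is a minimal sketch using only the standard library. The function `export_csv_sketch` and the two-column Results table are assumptions made for the example; this is not Ringtail's implementation.

```python
import csv
import io
import sqlite3


def export_csv_sketch(conn, requested_data, out, table=False):
    """Write a table, or the result of an SQL query, to `out` as CSV."""
    # When `table` is True, `requested_data` is treated as a table name.
    query = f'SELECT * FROM "{requested_data}"' if table else requested_data
    cursor = conn.execute(query)
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cursor.description])  # header row
    writer.writerows(cursor)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL)")
conn.executemany(
    "INSERT INTO Results VALUES (?, ?)", [("lig1", -7.5), ("lig2", -6.1)]
)

buffer = io.StringIO()
export_csv_sketch(conn, "Results", buffer, table=True)
csv_text = buffer.getvalue()
```

Passing a file object opened with `newline=""` instead of the in-memory buffer would write the same content to disk.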

export_receptors(): Export receptor in database to pdbqt

filter(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, filters_dict: dict = None, enumerate_interaction_combs: bool = False, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, log_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, bookmark_name: str = None, filter_bookmark: str = None, options_dict: dict = None, return_iter=False)

Prepare list of filters, then hand it off to storageman to perform filtering. Creates log of all ligand docking results that passes.

Parameters:

Filters:

eworst (float) – specify the worst energy value accepted
ebest (float) – specify the best energy value accepted
leworst (float) – specify the worst ligand efficiency value accepted
lebest (float) – specify the best ligand efficiency value accepted
score_percentile (float) – specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.
le_percentile (float) – specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.
vdw_interactions (list[tuple]) – define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_interactions (list[tuple]) – define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
reactive_interactions (list[tuple]) – check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_count (list[tuple]) – accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]
react_any (bool) – check if ligand reacted with any residue
max_miss (int) – Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from given set. By default, only the union of poses passing the interaction filter combinations is returned. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.
ligand_name (list[str]) – specify ligand name(s). Will combine name filters with OR, e.g., [[“lig1”, “lig2”]]
ligand_substruct (list[str]) – SMARTS pattern(s) for substructure matching, e.g., [[“ccc”, “CN”]]
ligand_substruct_pos (list[list[type]]) – SMARTS, index of atom in SMARTS, cutoff dist, and target XYZ coords, e.g., [[“[Oh]C”, 0, 1.2, -5.5, 10.0, 15.5]] -> [[“smart_string”, index_of_positioned_atom, cutoff_distance, x, y, z]]
ligand_max_atoms (int) – Maximum number of heavy atoms a ligand may have
ligand_operator (str) – logical join operator for multiple SMARTS (default: OR), either AND or OR
filters_dict (dict) – provide filters as a dictionary

Ligand results options:

enumerate_interaction_combs (bool) – When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.
output_all_poses (bool) – By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.
mfpt_cluster (float) – Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.
interaction_cluster (float) – Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.
log_file (str) – by default, results are saved in output_log.txt; if this option is used, ligands and requested info passing the filters will be written to specified file
overwrite (bool) – by default, if a log file exists, it doesn’t get overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database
order_results (str) – Stipulates how to order the results when written to the log file. By default, results are ordered by the order in which they were added to the database. ONLY TAKES ONE OPTION. Available fields are: “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds)
outfields (str) – defines which fields are used when reporting the results (to stdout and to the log file); fields are specified as comma-separated values, e.g. –outfields=e,le,hb; by default, docking_score (energy) and ligand name are reported; ligand always reported in first column. Available fields are: “Ligand_name” (Ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name)
bookmark_name (str) – name for resulting bookmark. Default value is ‘passing_results’
filter_bookmark (str) – name of bookmark to perform filtering over
options_dict (dict) – write options as a dict
return_iter (bool) – return an iterable of all of the filtering results

Returns:

number of ligands passing filter
iter (optional): an iterable of all of the filtering results

Return type:

int
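
The sign convention of the hb_count filter described above (a positive number means "at least", a negative number means "no more than") can be sketched as a small predicate. The function `passes_hb_count` is a hypothetical illustration, not Ringtail code, which performs this filtering in the database.

```python
def passes_hb_count(num_hb, requested):
    """Return True if a pose with `num_hb` hydrogen bonds passes the filter.

    requested >= 0: accept poses with at least `requested` hydrogen bonds.
    requested <  0: accept poses with no more than abs(requested).
    """
    if requested < 0:
        return num_hb <= -requested
    return num_hb >= requested


results = {
    "at_least_5": [passes_hb_count(n, 5) for n in (3, 5, 7)],
    "at_most_2": [passes_hb_count(n, -2) for n in (1, 2, 3)],
}
print(results)
```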

finalize_write(): Finalize database write by creating interaction tables and setting database version

find_similar_ligands(query_ligname: str)

Find ligands in cluster with query_ligname

Parameters:: query_ligname (str) – name of the ligand in the ligand table to look for similars to
Returns:: number of ligands that are similar
Return type:: int

static generate_config_file_template()

Outputs to “config.json” in the current working directory if to_file = true; otherwise returns the dict of default option values used for the API (for the command line, a few more options are included that are always used explicitly when using the API)

Parameters:: to_file (bool) – whether to produce the template as a json string or as a file “config.json”
Returns:: file name of config file or json string with template including default values
Return type:: str

get_bookmark_names()

Method to retrieve all bookmark names in a database

Returns:: names of all bookmarks in a database
Return type:: list

static get_options_info() → dict: Gets names, default values, and meta data for all Ringtail options.

get_plot_data(bookmark_name: str = None)

Get ligand efficiency and energy for all docking data and for ligands that passed filtering in specified bookmark. Each tuple in the respective lists contains docking_score, leff, pose_id, and ligand name.

Parameters:: bookmark_name (str)
Returns:: [all_data], [filtered_data]
Return type:: list(tuple), list(tuple)

get_previous_filter_data(outfields=None, bookmark_name=None, log_file=None)

Get data requested in self.out_opts[‘outfields’] from the results bookmark of a previous filtering

Parameters:

outfields (str) – use outfields as described in RingtailOptions > StorageOptions
bookmark_name (str) – bookmark for which the filters were used

ligands_rdkit_mol(bookmark_name=None, write_nonpassing=False) → dict

Creates a dictionary of RDKit mols of all ligands specified from a bookmark, either excluding (default) or including those ligands that did not pass the filter(s).

Parameters:

bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering
write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands

Returns:

containing ligand names, RDKit mols, flexible residue bools, and other ligand properties

Return type:

all_mols (dict)

plot(save=True, bookmark_name: str = None, return_fig_handle: bool = False)

Get data needed for creating Ligand Efficiency vs Energy scatter plot from storageManager. Call OutputManager to create plot. Option to save the plot and close it immediately, or keep it open and save it manually later.

Parameters:

save (bool) – whether to save plot to the current directory. Will save and close figure
bookmark_name (str) – bookmark from which to fetch filtered data to plot
return_fig_handle (bool) – use to return a handle to the matplotlib figure instead of saving or showing figure

Returns:

will not show figure if returning figure handle

Return type:

matplotlib.pyplot.figure (optional)

produce_summary(columns=['docking_score', 'leff'], percentiles=[1, 10]) → None

Print summary of data in storage to stdout

Parameters:

columns (list(str)) – data columns used to prepare summary
percentiles (list(int)) – cutoff percentiles for the summary
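
To make the idea of cutoff percentiles concrete, here is a minimal sketch that computes the score cutoff for the top 1 and top 10 percent of a set of docking scores. It assumes that more negative scores are better and uses the nearest-rank method; how Ringtail computes percentiles internally is not stated here, so treat both choices as assumptions. The helper `percentile_cutoff` is hypothetical.

```python
import math


def percentile_cutoff(values, percentile):
    """Nearest-rank cutoff: worst value still inside the top `percentile` %.

    Assumes lower (more negative) values are better.
    """
    ordered = sorted(values)  # best (most negative) first
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


# 20 made-up docking scores from -10.0 down to -14.75 in steps of 0.25
scores = [-(40 + i) / 4 for i in range(20)]
summary = {p: percentile_cutoff(scores, p) for p in (1, 10)}
print(summary)
```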

save_receptor(receptor_file)

Add receptor to database.

Parameters:: receptor_file (str) – path to receptor file

set_filters(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, dict: dict = None)

Create a filter object containing all numerical and string filters.

Parameters:

eworst (float) – specify the worst energy value accepted
ebest (float) – specify the best energy value accepted
leworst (float) – specify the worst ligand efficiency value accepted
lebest (float) – specify the best ligand efficiency value accepted
score_percentile (float) – specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.
le_percentile (float) – specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.
vdw_interactions (list[tuple]) – define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_interactions (list[tuple]) – define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
reactive_interactions (list[tuple]) – check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]
hb_count (list[tuple]) – accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]
react_any (bool) – check if ligand reacted with any residue
max_miss (int) – Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from given set. By default, only the union of poses passing the interaction filter combinations is returned. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.
ligand_name (list[str]) – specify ligand name(s). Will combine name filters with OR, e.g., [“lig1”, “lig2”]
ligand_substruct (list[str]) – SMARTS pattern(s) for substructure matching, e.g., [“ccc”, “CN”]
ligand_substruct_pos (list[str]) – SMARTS, index of atom in SMARTS, cutoff dist, and target XYZ coords, e.g., [‘”[Oh]C” 0 1.2 -5.5 10.0 15.5’] -> [“smart_string index_of_positioned_atom cutoff_distance x y z”]
ligand_max_atoms (int) – Maximum number of heavy atoms a ligand may have
ligand_operator (str) – logical join operator for multiple SMARTS (default: OR), either AND or OR
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
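
The max_miss behaviour described above (all combinations of the interaction filters with up to max_miss of them left out) can be sketched with itertools. The helper `interaction_combinations` and the third residue string are made up for illustration; Ringtail's own enumeration may differ in order and representation.

```python
from itertools import combinations


def interaction_combinations(interactions, max_miss):
    """All subsets of `interactions` that omit at most `max_miss` entries."""
    combos = []
    for n_missing in range(max_miss + 1):
        size = len(interactions) - n_missing
        if size <= 0:  # never produce an empty filter set
            break
        combos.extend(combinations(interactions, size))
    return combos


# The last residue is a made-up example value.
filters = ["A:VAL:279:", "A:LYS:162:", "A:ASP:86:"]
combos = interaction_combinations(filters, max_miss=1)
for combo in combos:
    print(combo)
```

With three interaction filters and max_miss=1, this yields the full set plus the three pairs, which is why enumerating each combination separately can increase runtime noticeably.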

set_output_options(log_file: str = None, export_sdf_path: str = None, enumerate_interaction_combs: bool = None, dict: dict = None)

Creates output options object that holds attributes related to reading and outputting results. Will assign log_file name and export_sdf_path to the output_manager object.

Parameters:

log_file (str) – by default, results are saved in “output_log.txt”; if this option is used, ligands and requested info passing the filters will be written to specified file
export_sdf_path (str) – specify the path where to save poses of ligands passing the filters (SDF format); if the directory does not exist, it will be created; if it already exists, an error is raised unless –overwrite is used. NOTE: the log file will be automatically saved in this path. Ligands will be stored as SDF files in the order specified.
enumerate_interaction_combs (bool) – When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args

set_resultsman_attributes(store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, dict: dict = None)

Create results_manager_options object if needed, sets options, and assigns them to the results manager object.

Parameters:

store_all_poses (bool) – store all ligand poses; overrides max_poses
max_poses (int) – how many poses to save (ordered by some score?)
add_interactions (bool) – add ligand-receptor interaction data, only in vina mode
interaction_tolerance (float) – longest distance, in ångströms, that is considered an interaction?
interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction
max_proc (int) – max number of computer processors to use for file reading
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args

set_storageman_attributes(filter_bookmark: str = None, duplicate_handling: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, output_all_poses: str = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, dict: dict = None)

Create storage_manager_options object if needed, sets options, and assigns them to the storage manager object.

Parameters:

filter_bookmark (str) – Perform filtering over specified bookmark. (in output group in CLI)
duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.
overwrite (bool) – by default, if a log file exists, it doesn’t get overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database
order_results (str) – Stipulates how to order the results when written to the log file. By default, results are ordered by the order in which they were added to the database. ONLY TAKES ONE OPTION. Available fields are: “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds)
outfields (str) – defines which fields are used when reporting the results (to stdout and to the log file); fields are specified as comma-separated values, e.g. “–outfields=e,le,hb”; by default, docking_score (energy) and ligand name are reported; ligand always reported in first column. Available fields are: “Ligand_name” (Ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name). Fields are printed in the order in which they are provided. Ligand name will always be returned and will be added in first position if not specified.
output_all_poses (bool) – By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.
mfpt_cluster (float) – Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.
interaction_cluster (float) – Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.
bookmark_name (str) – name for resulting book mark file. Default value is “passing_results”
dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
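
The clustering options above are expressed as a Tanimoto distance cutoff. As a rough illustration of that quantity only (not of Butina clustering itself, and not of Ringtail's fingerprinting, which is not shown here), the sketch below computes the Tanimoto distance between two fingerprints represented as sets of "on" bit indices. The function name and the example bit sets are made up.

```python
def tanimoto_distance(bits_a, bits_b):
    """Tanimoto distance: 1 - |A ∩ B| / |A ∪ B| for two sets of on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:  # two empty fingerprints are treated as identical
        return 0.0
    return 1.0 - len(a & b) / len(union)


cutoff = 0.5  # default clustering cutoff named in the option description
distance = tanimoto_distance({1, 2, 3, 4}, {3, 4, 5, 6})
# Assumption: fingerprints within the cutoff distance count as neighbours.
within_cutoff = distance <= cutoff
print(distance, within_cutoff)
```

Here the two fingerprints share 2 of 6 distinct bits, so the distance is 1 - 2/6, which lies outside a 0.5 cutoff.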

update_database_version(consent=False, new_version='2.0.0'): Method to update database version from earlier versions to either 1.1.0 or 2.0.0

write_flexres_pdb(receptor_polymer, ligname: str, filename: str, bookmark_name: str = None)

Writes a receptor pdb with flexible residues based on the ligand provided

Parameters:

receptor_polymer (Polymer) – version of receptor produced by meeko
ligname (str) – ligand name for which the receptor flexible residue info should be collected
filename (str) – name of the output pdb, extension is optional, will default to ‘.pdb’
bookmark_name (str, optional) – will use last used bookmark if not specified, will not work in a db without any filtering performed

write_molecule_sdfs(sdf_path: str = None, all_in_one: bool = True, bookmark_name: str = None, write_nonpassing: bool = None)

Have output manager write molecule sdf files for passing results in given results bookmark

Parameters:

sdf_path (str, optional) – Optional path, existing or to be created in the current directory, where SDF files will be saved
all_in_one (bool, optional) – If True will write all molecules to one SDF (separated by $$$$), if False will write one molecule per SDF
bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering
write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands

Raises:

StorageError – if bookmark or data not found

exception ringtail.StorageError: Bases: Exception

class ringtail.StorageManager

Bases: object

check_passing_bookmark_exists(bookmark_name: str = None)

Checks if bookmark name is in database

Parameters:: bookmark_name (str, optional) – name of the bookmark to check for; if not given, the storageman bookmark_name attribute is used
Returns:: indicates if bookmark_name exists in the current database
Return type:: bool

check_storage_compatibility()

Checks if chosen storage type has been implemented

Parameters:: storage_type (str) – name of the storage type
Raises:: NotImplementedError – raised if selected storage type has not been implemented
Returns:: class of the implemented storage type
Return type:: class

close_storage(attached_db=None, vacuum=False)

Close connection to database

Parameters:

attached_db (str, optional) – name of attached DB (not including file extension)
vacuum (bool, optional) – indicates that database should be vacuumed before closing
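
A minimal sketch of closing an SQLite connection with an optional vacuum, using only the standard library sqlite3 module. The function `close_storage_sketch` is a hypothetical stand-in; handling of an attached database is left out, and Ringtail's own method may do more.

```python
import sqlite3


def close_storage_sketch(conn, vacuum=False):
    """Commit pending work, optionally VACUUM, then close the connection."""
    conn.commit()  # VACUUM cannot run inside an open transaction
    if vacuum:
        conn.execute("VACUUM")  # rebuilds the file and reclaims free pages
    conn.close()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (LigName TEXT)")
conn.execute("INSERT INTO Results VALUES ('lig1')")
close_storage_sketch(conn, vacuum=True)

# Any further use of a closed connection raises ProgrammingError.
try:
    conn.execute("SELECT 1")
    is_closed = False
except sqlite3.ProgrammingError:
    is_closed = True
```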

crossref_filter(new_db: str, bookmark1_name: str, bookmark2_name: str, selection_type='-', old_db=None) → tuple

Selects ligands found or not found in the given bookmark in both current db and new_db. Stores as temp view

Parameters:

new_db (str) – file name for database to attach
bookmark1_name (str) – string for name of first bookmark/temp table to compare
bookmark2_name (str) – string for name of second bookmark to compare
selection_type (str) – “+” or “-” indicating if ligand names should (“+”) or should not (“-”) be in both databases
old_db (str, optional) – file name for previous database

Returns:

(name of new bookmark (str), number of ligands passing new bookmark (int))

Return type:

tuple
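
The selection logic can be pictured with plain Python sets. This sketch assumes that “+” keeps ligand names present in both bookmarks and “-” keeps names from the first bookmark that are absent from the second; the real method works on attached SQLite databases and temporary views, and the helper `crossref` is hypothetical.

```python
def crossref(current_ligands, other_ligands, selection_type="-"):
    """Return (sorted ligand names, count) for the requested selection."""
    current, other = set(current_ligands), set(other_ligands)
    if selection_type == "+":
        selected = current & other  # present in both
    elif selection_type == "-":
        selected = current - other  # present only in the current bookmark
    else:
        raise ValueError("selection_type must be '+' or '-'")
    return sorted(selected), len(selected)


in_both = crossref(["lig1", "lig2", "lig3"], ["lig2", "lig3", "lig4"], "+")
only_current = crossref(["lig1", "lig2", "lig3"], ["lig2", "lig3", "lig4"], "-")
print(in_both, only_current)
```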

field_to_column_name = {'Ligand_name': 'LigName', 'delta': 'deltas', 'e': 'docking_score', 'e_elec': 'energies_electro', 'e_inter': 'energies_inter', 'e_intra': 'energies_intra', 'e_vdw': 'energies_vdw', 'hb': 'num_hb', 'interactions': 'interactions', 'le': 'leff', 'ligand_smile': 'ligand_smile', 'n_interact': 'nr_interactions', 'rank': 'pose_rank', 'receptor': 'receptor', 'ref_rmsd': 'reference_rmsd', 'run': 'run_number'}
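
As an illustration of how this mapping can be used, the sketch below turns a comma-separated outfields string into database column names, adding the ligand name in first position when it is not specified (as described for the outfields option). The function `outfields_to_columns` is hypothetical; the dictionary is copied from the attribute above.

```python
FIELD_TO_COLUMN_NAME = {
    "Ligand_name": "LigName", "delta": "deltas", "e": "docking_score",
    "e_elec": "energies_electro", "e_inter": "energies_inter",
    "e_intra": "energies_intra", "e_vdw": "energies_vdw", "hb": "num_hb",
    "interactions": "interactions", "le": "leff",
    "ligand_smile": "ligand_smile", "n_interact": "nr_interactions",
    "rank": "pose_rank", "receptor": "receptor",
    "ref_rmsd": "reference_rmsd", "run": "run_number",
}


def outfields_to_columns(outfields):
    """Translate 'e,le,hb' style outfields into database column names."""
    fields = [f.strip() for f in outfields.split(",") if f.strip()]
    if "Ligand_name" not in fields:
        fields.insert(0, "Ligand_name")  # ligand name always comes first
    return [FIELD_TO_COLUMN_NAME[f] for f in fields]


columns = outfields_to_columns("e,le,hb")
print(columns)
```

An unknown field name raises KeyError in this sketch; a real implementation would more likely report it as an option error.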

filter_results(all_filters: dict, suppress_output=False) → iter

Generate and execute database queries from given filters.

Parameters:

all_filters (dict) – dict containing all filters. Expects format and keys corresponding to ringtail.Filters().todict()
suppress_output (bool) – if True, suppresses printing of the filtering summary to stdout

Returns:

iterable, such as an sqlite cursor, of passing results

Return type:

iter

finalize_database_write(): Methods to finalize when a database has been written to, and saving the current database schema to the sqlite database.

get_plot_data(bookmark_name: str = None, only_passing=False)

This function is expected to return an ascii plot representation of the results

Parameters:

bookmark_name (str) – name of bookmark for which to fetch passing data. Will use default bookmark name if None. Returns empty list if bookmark does not exist.
only_passing (bool) – Only return data for passing ligands. Will return empty list for all data.

Returns:

cursors as (<all data cursor>, <passing data cursor>)

Return type:

tuple

insert_data(results_array, ligands_array, interaction_list, receptor_array=[], insert_receptor=False)

Inserts data from all arrays returned from results manager.

Parameters:

results_array (list) – list of data to be stored in Results table
ligands_array (list) – list of data to be stored in Ligands table
interaction_list (list) – list of data to be stored in interaction tables
receptor_array (list) – list of data to be stored in Receptors table
insert_receptor (bool, optional) – flag indicating that receptor info should be inserted

insert_interactions(Pose_IDs: list, interactions_list, duplicates)

Takes list of interactions, inserts into database

Parameters:

Pose_IDs (list(int)) – list of pose ids assigned while writing the current results to database
interactions_list (list) – List of tuples for interactions in form (“type”, “chain”, “residue”, “resid”, “recname”, “recid”)
duplicates (list(Pose_ID)) – any duplicates identified in “insert_results”, if duplicate handling has been specified

prune(): Deletes rows from results, ligands, and interactions in a bookmark if they do not pass filtering criteria

class ringtail.StorageManagerSQLite(db_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, filter_bookmark: str = None, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, duplicate_handling: str = None)

Bases: StorageManager

SQLite-specific StorageManager subclass

conn

Connection to database

Type:: SQLite.conn

open_cursors

list of cursors that were not closed by the function that created them. Will be closed by close_connection method.

Type:: list

db_file

database name

Type:: str

overwrite

switch to overwrite database if it exists

Type:: bool

order_results

what column name will be used to order results once read

Type:: str

outfields

data fields/columns to include when reading and outputting data

Type:: str

filter_bookmark

name of bookmark that filtering will be performed over

Type:: str

output_all_poses

whether or not to output all poses of a ligand

Type:: bool

mfpt_cluster

Tanimoto distance cutoff used to cluster ligands based on Morgan fingerprints

Type:: float

interaction_cluster

Tanimoto distance cutoff used to cluster ligands based on interaction fingerprints

Type:: float

bookmark_name

name of current bookmark being written to or read from

Type:: str

duplicate_handling

optional attribute to deal with insertion of ligands already in the database

Type:: str

current_bookmark_name

name of last view to have been written to in the database

Type:: str

filtering_window

name of bookmark/view being filtered on

Type:: str

index_columns

Type:: list

view_suffix

current suffix for views

Type:: int

temptable_suffix

current suffix for temporary tables

Type:: int

field_to_column_name

Dictionary for converting ringtail options into DB column names

Type:: dict

bookmark_has_rows(bookmark_name: str) → bool

Method that checks if a given bookmark has any data in it

Parameters:: bookmark_name (str) – view to check
Returns:: True if more than zero rows in bookmark
Return type:: bool
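
Since bookmarks are stored as views, checking whether one has rows can be sketched as a query that asks for a single row. The table layout and the view names below are assumptions for the example, and `bookmark_has_rows_sketch` is not the library's implementation.

```python
import sqlite3


def bookmark_has_rows_sketch(conn, bookmark_name):
    """Return True if the view `bookmark_name` yields at least one row."""
    row = conn.execute(f'SELECT 1 FROM "{bookmark_name}" LIMIT 1').fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL)")
conn.executemany(
    "INSERT INTO Results VALUES (?, ?)", [("lig1", -7.5), ("lig2", -5.0)]
)
# Two example bookmarks (views): one with a passing pose, one empty.
conn.execute(
    "CREATE VIEW passing_results AS "
    "SELECT * FROM Results WHERE docking_score < -6"
)
conn.execute(
    "CREATE VIEW empty_bookmark AS "
    "SELECT * FROM Results WHERE docking_score < -20"
)

has_passing = bookmark_has_rows_sketch(conn, "passing_results")
has_empty = bookmark_has_rows_sketch(conn, "empty_bookmark")
```

Using LIMIT 1 avoids counting every row when only existence matters.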

check_ringtaildb_version()

Checks the database version and confirms whether the code base is compatible with it

Returns:: whether or not db is compatible with the code base (bool); current database version (str)
Return type:: bool

check_storage_ready(run_mode: str, docking_mode: str, store_all_poses: bool, max_poses: int)

Check that storage is ready before proceeding, and creates new tables if needed

Parameters:

run_mode (str) – whether ringtail is run using the command line interface or the API
docking_mode (str) – what docking engine was used to produce results
store_all_poses (bool) – overrides max poses
max_poses (int) – max poses to save to db

Raises:

StorageError –
OptionError – if database options are not compatible

clone(backup_name=None)

Creates a copy of the db

Parameters:: backup_name (str, optional) – name of the cloned database
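
One way to copy an SQLite database from Python is the standard library `sqlite3.Connection.backup` method. The sketch below is an assumption about how such a clone could be done, not a description of Ringtail's implementation; `clone_sketch` and the one-column Ligands table are made up for the example.

```python
import sqlite3


def clone_sketch(source, backup_name):
    """Copy every table of `source` into a new database at `backup_name`."""
    target = sqlite3.connect(backup_name)
    source.backup(target)  # page-by-page copy of the whole database
    return target


source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE Ligands (LigName TEXT)")
source.executemany("INSERT INTO Ligands VALUES (?)", [("lig1",), ("lig2",)])
source.commit()

# ":memory:" keeps the example self-contained; a file name works the same.
copy = clone_sketch(source, ":memory:")
copied_names = [
    row[0] for row in copy.execute("SELECT LigName FROM Ligands ORDER BY LigName")
]
print(copied_names)
```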

count_receptors_in_db()

returns number of rows in Receptors table where receptor_object already has blob

Returns:: number of rows in receptors table (int); name of receptor if present in table (str)
Return type:: int
Raises:: DatabaseQueryError –

create_bookmark(name, query, temp=False, add_poseID=False, filters={})

Takes name and selection query and creates a bookmark of name. Bookmarks are Ringtail-specific views whose information is stored in the ‘Bookmark’ table. #FIXME bug where ligand filter only results are not added as bookmarks

Parameters:

name (str) – Name for bookmark which will be created
query (str) – SQLite-formatted query used to create bookmark
temp (bool, optional) – Flag if bookmark should be temporary
add_poseID (bool, optional) – Add Pose_ID column to bookmark
filters (dict, optional) – a dict of filters used to construct the query
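
The idea of a bookmark as a view plus a bookkeeping row can be sketched as follows. The Bookmarks table layout (Bookmark_name, Query, Filters) and the helper `create_bookmark_sketch` are assumptions chosen for illustration; the actual schema used by Ringtail is not given here.

```python
import json
import sqlite3


def create_bookmark_sketch(conn, name, query, filters=None):
    """Create a view from `query` and record it in a bookkeeping table."""
    conn.execute(f'CREATE VIEW "{name}" AS {query}')
    conn.execute(
        "INSERT INTO Bookmarks (Bookmark_name, Query, Filters) VALUES (?, ?, ?)",
        (name, query, json.dumps(filters or {})),  # filters stored as JSON text
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL)")
conn.execute("CREATE TABLE Bookmarks (Bookmark_name TEXT, Query TEXT, Filters TEXT)")
conn.executemany(
    "INSERT INTO Results VALUES (?, ?)", [("lig1", -7.5), ("lig2", -6.1)]
)

create_bookmark_sketch(
    conn,
    "passing_results",
    "SELECT * FROM Results WHERE docking_score <= -7",
    filters={"eworst": -7},
)
passing = [row[0] for row in conn.execute("SELECT LigName FROM passing_results")]
stored_filters = json.loads(
    conn.execute(
        "SELECT Filters FROM Bookmarks WHERE Bookmark_name = ?",
        ("passing_results",),
    ).fetchone()[0]
)
```

Storing the filters alongside the query is what allows them to be retrieved later for a given bookmark.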

create_bookmark_from_temp_table(temp_table_name, bookmark_name, original_bookmark_name, wanted_list, unwanted_list=[])

Resaves temp bookmark stored in self.current_bookmark_name as new permanent bookmark

Parameters:

bookmark_name (str) – name of bookmark to save last temp bookmark as
original_bookmark_name (str) – name of original bookmark
wanted_list (list) – List of wanted database names
unwanted_list (list, optional) – List of unwanted database names
temp_table_name (str) – name of temporary table

create_temp_table_from_bookmark(): Method that creates a temporary table named “passing_temp”. Please note that this table will be dropped as soon as the database connection closes.

drop_bookmark(bookmark_name: str)

Drops specified bookmark from database

Parameters:: bookmark_name (str) – bookmark to be dropped
Raises:: DatabaseInsertionError –

fetch_bookmark(bookmark_name: str) → Cursor

returns SQLite cursor of all fields in bookmark

Parameters:: bookmark_name (str) – name of bookmark to retrieve
Returns:: cursor of requested view
Return type:: sqlite3.Cursor

fetch_clustered_similars(ligname: str)

Given ligname, returns Pose_IDs for similar poses/ligands from previous clustering. The user is prompted at runtime to choose a cluster.

Parameters:

ligname (str) – ligname for ligand to find similarity with

Raises:

ValueError – wrong terminal input
DatabaseQueryError –

fetch_data_for_passing_results() → iter

Will return SQLite cursor with requested data for outfields for poses that passed filter in self.bookmark_name

Returns:: SQLite cursor of data for the passing poses
Return type:: iter
Raises:: OptionError –

fetch_filters_from_bookmark(bookmark_name: str = None)

Method that will retrieve filter values used to construct bookmark

Parameters:: bookmark_name (str, optional) – can get filter values for given bookmark, or filter values from currently active bookmark in storageman
Returns:: containing the filter data
Return type:: dict

fetch_flexres_info()

fetch flexible residues names and atomname lists

Returns:: (flexible_residues, flexres_atomnames)
Return type:: tuple

fetch_interaction_info_by_index(interaction_idx) → tuple

Returns tuple containing interaction info for given interaction_idx

Parameters:: interaction_idx (int) – interaction index to fetch info for
Returns:: tuple of info for requested interaction
Return type:: tuple

fetch_nonpassing_pose_properties(ligname)

fetch coordinates for poses of ligname which did not pass the filter

Parameters:

ligname (str) – name of ligand to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_passing_ligand_output_info() → iter

fetch information required by vsmanager for writing out molecules

Returns:

contains LigName, ligand_smile, atom_index_map, hydrogen_parents

Return type:

iter

fetch_passing_pose_properties(ligname)

fetch coordinates for poses passing filter for given ligand

Parameters:

ligname (str) – name of ligand to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_pose_interactions(Pose_ID) → iter

Fetch all interactions parameters belonging to a Pose_ID

Parameters:: Pose_ID (int) – pose id, 1-1 with Results table
Returns:: iterable of interaction information for the given Pose_ID
Return type:: iter

fetch_receptor_object_by_name(rec_name)

Returns Receptor object from database for given rec_name

Parameters:: rec_name (str) – Name of receptor to return object for

Returns:: receptor object as a string
Return type:: str

fetch_receptor_objects()

Returns all Receptor objects from database

Returns:: iterable of receptor names and objects
Return type:: iter (tuple)

fetch_single_ligand_output_info(ligname) → str

get output information for given ligand

Parameters:: ligname (str) – ligand name
Raises:: DatabaseQueryError –
Returns:: information containing smiles, atom and index mapping, and hydrogen parents
Return type:: str

fetch_single_pose_properties(pose_ID: int) → iter

fetch coordinates for pose given by pose_ID

Parameters:

pose_ID (int) – ID of the pose to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_summary_data(columns=['docking_score', 'leff'], percentiles=[1, 10]) → dict

Collect summary data for database:

Num Ligands
Num stored poses
Num unique interactions
min, max, percentiles for columns in columns

Parameters:

columns (list (str)) – columns to be displayed and used in summary
percentiles (list(int)) – percentiles to consider

Returns:

summary of the data

Return type:

dict
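To make the shape of such a summary concrete, here is a minimal sketch that computes counts, min, max, and percentiles from a simplified Results table. The table layout, the dictionary keys, and the nearest-rank percentile method are assumptions; the source does not state how Ringtail computes percentiles, and the count of unique interactions is left out to keep the example short.

```python
import math
import sqlite3


def nearest_rank_percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (assumed method)."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(conn, columns=("docking_score", "leff"), percentiles=(1, 10)):
    """Build a summary dict; assumes each column has at least one non-NULL value."""
    summary = {
        "num_ligands": conn.execute(
            "SELECT COUNT(DISTINCT LigName) FROM Results"
        ).fetchone()[0],
        "num_poses": conn.execute("SELECT COUNT(*) FROM Results").fetchone()[0],
    }
    for col in columns:
        # Column names cannot be bound as parameters, so validate them first.
        if not col.isidentifier():
            raise ValueError(f"invalid column name: {col!r}")
        values = [
            row[0]
            for row in conn.execute(
                f"SELECT {col} FROM Results WHERE {col} IS NOT NULL ORDER BY {col}"
            )
        ]
        summary[f"min_{col}"] = values[0]
        summary[f"max_{col}"] = values[-1]
        for pct in percentiles:
            summary[f"{pct}%_{col}"] = nearest_rank_percentile(values, pct)
    return summary


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Results ("
    "Pose_ID INTEGER PRIMARY KEY, LigName TEXT, docking_score REAL, leff REAL)"
)
# Ten poses spread over five ligands, scores from -10.0 to -1.0.
rows = [(f"lig_{i // 2}", -10.0 + i, (-10.0 + i) / 20) for i in range(10)]
conn.executemany(
    "INSERT INTO Results (LigName, docking_score, leff) VALUES (?, ?, ?)", rows
)
summary = summarize(conn, columns=["docking_score"], percentiles=[10, 50])
```

Since more negative docking scores are better, low percentiles of an ascending sort correspond to the best-scoring poses.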

classmethod format_for_storage(ligand_dict: dict) → tuple

Takes the file dictionary from the file parser and formats it into the required storage format

Parameters:

ligand_dict (dict) – Dictionary containing data from the fileparser

Returns:

of lists ([result_row_1, result_row_2, …], ligand_row, [interaction_tuple_1, interaction_tuple_2, …])

Return type:

tuple

get_all_bookmark_names()

Get all bookmarks in sql database as a list of names. Bookmarks are a specific type of sqlite-views whose information is stored in the Bookmarks table.

Returns:: list of bookmark names
Return type:: list

get_current_bookmark_name()

returns current bookmark name

Returns:: name of last passing results bookmark used by database
Return type:: str

get_maxmiss_union(total_combinations: int)

Get results that are in union considering max miss

Parameters:: total_combinations (int) – number of possible combinations
Returns:: iterable of passing results
Return type:: iter

insert_receptor_blob(receptor, rec_name)

Takes object of Receptor class, updates the column in Receptor table

Parameters:

receptor (bytes) – bytes receptor object to be inserted into DB
rec_name (string) – Name of receptor. Used to insert into correct row of DB

Raises:

DatabaseInsertionError –

overwrite_storage(): Will drop all tables in the database.
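Dropping every table in a SQLite database can be sketched by reading the table names from sqlite_master. The helper drop_all_tables is hypothetical and only handles tables; whether Ringtail's method also removes views or other objects is not stated here.

```python
import sqlite3


def drop_all_tables(conn):
    """Drop every user table and return the dropped names (hypothetical helper)."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    for name in names:
        # Quote the identifier, since table names cannot be bound as parameters.
        quoted = '"' + name.replace('"', '""') + '"'
        conn.execute(f"DROP TABLE IF EXISTS {quoted}")
    conn.commit()
    return names


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (Pose_ID INTEGER PRIMARY KEY, docking_score REAL)")
conn.execute("CREATE TABLE Ligands (LigName TEXT PRIMARY KEY)")

dropped = drop_all_tables(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table'"
).fetchone()[0]
```

The operation is destructive and cannot be undone, so making a copy of the database first is a reasonable precaution.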

set_bookmark_suffix(suffix)

Sets internal bookmark_suffix variable

Parameters:: suffix (str) – suffix to attach to bookmark-related queries or creation

to_dataframe(requested_data: str, table=True) → pandas.DataFrame

Returns a pandas DataFrame of the table or query given as requested_data

Parameters:

requested_data (str) – String containing SQL-formatted query or table name
table (bool) – Flag indicating if requested_data is table name or not

Returns:

dataframe of requested data

Return type:

pd.DataFrame

update_database_version(new_version, consent=False)

method that updates sqlite database schema 1.0.0 or 1.1.0 to 1.1.0 or 2.0.0

#NOTE: If you created a version 1 database with the duplicate handling option, there is a chance of inconsistent behavior of anything involving interactions as the Pose_ID was not used as an explicit foreign key in db v1.0.0 and v1.1.0.

Parameters:: consent (bool, optional) – variable to ensure consent to update database is explicit
Returns:: bool

exception ringtail.WriteToStorageError: Bases: Exception

class ringtail.Writer(*args: Any, **kwargs: Any)

Bases: Process

This class is a listener that retrieves data from the queue and writes it into the database

process_data(data_packet)

Breaks up the data in the data_packet to distribute between the different arrays to be inserted in the database.

Parameters:: data_packet (any) – File packet to be processed

run()

Method overload from parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method which is called by the initialization (see below) with start()

Raises:: WriteToStorageError –

write_to_storage(): Inserting data to the database through the designated storagemanager.
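The listener pattern described for this class (take packets off a queue, collect them, write them to the database) can be sketched as follows. To stay self-contained the sketch swaps in a threading.Thread and queue.Queue instead of a separate process; ListenerSketch, the sentinel convention, the batch size, and the table layout are all assumptions rather than Ringtail's actual design.

```python
import os
import queue
import sqlite3
import tempfile
import threading

SENTINEL = None  # assumed end-of-work marker


class ListenerSketch(threading.Thread):
    """Consumes (LigName, docking_score) packets and writes them in batches."""

    def __init__(self, work_queue, db_path, batch_size=2):
        super().__init__()
        self.work_queue = work_queue
        self.db_path = db_path
        self.batch_size = batch_size
        self.pending = []
        self.rows_written = 0

    def run(self):
        # Open the connection inside the thread that uses it.
        conn = sqlite3.connect(self.db_path)
        try:
            conn.execute(
                "CREATE TABLE IF NOT EXISTS Results (LigName TEXT, docking_score REAL)"
            )
            while True:
                packet = self.work_queue.get()
                if packet is SENTINEL:
                    break
                self.process_data(packet)
                if len(self.pending) >= self.batch_size:
                    self.write_to_storage(conn)
            # Flush whatever is left after the sentinel arrives.
            self.write_to_storage(conn)
        finally:
            conn.close()

    def process_data(self, packet):
        self.pending.append(packet)

    def write_to_storage(self, conn):
        if not self.pending:
            return
        conn.executemany("INSERT INTO Results VALUES (?, ?)", self.pending)
        conn.commit()
        self.rows_written += len(self.pending)
        self.pending = []


db_path = os.path.join(tempfile.mkdtemp(), "listener.db")
work_queue = queue.Queue()
listener = ListenerSketch(work_queue, db_path)
listener.start()
for packet in [("lig_a", -8.1), ("lig_b", -5.4), ("lig_c", -9.0)]:
    work_queue.put(packet)
work_queue.put(SENTINEL)
listener.join()

check = sqlite3.connect(db_path)
stored = check.execute("SELECT COUNT(*) FROM Results").fetchone()[0]
check.close()
```

Batching the inserts keeps the number of commits low, which is usually the main cost when many small results are written to SQLite.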

ringtail.parse_single_dlg(fname)

Parse an ADGPU DLG file, either uncompressed or gzipped

Parameters:

fname (str) – ligand docking result file name

Raises:

ValueError –
FileParsingError –

Returns:

parsed results ready to be inserted in database

Return type:

dict
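Reading a file that may be either plain text or gzipped is commonly handled by choosing the opener from the file name. The sketch below shows that idea only; open_maybe_gzipped and read_lines are hypothetical helpers, detection by the ".gz" extension is an assumption, and the file contents are placeholders rather than real DLG output.

```python
import gzip
import os
import tempfile


def open_maybe_gzipped(fname):
    """Return a text-mode handle for a plain or gzipped file (hypothetical helper)."""
    if fname.endswith(".gz"):
        return gzip.open(fname, "rt")
    return open(fname, "r")


def read_lines(fname):
    with open_maybe_gzipped(fname) as handle:
        return [line.rstrip("\n") for line in handle]


# Write the same placeholder content once as plain text and once gzipped.
workdir = tempfile.mkdtemp()
content = "placeholder line 1\nplaceholder line 2\n"
plain_path = os.path.join(workdir, "ligand.dlg")
gz_path = os.path.join(workdir, "ligand.dlg.gz")
with open(plain_path, "w") as handle:
    handle.write(content)
with gzip.open(gz_path, "wt") as handle:
    handle.write(content)

plain_lines = read_lines(plain_path)
gz_lines = read_lines(gz_path)
```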

ringtail.parse_vina_result(data_pointer) → dict

Parser for vina docking results, supporting either pdbqt or gzipped (.gz) files, or with the docking results provided as a string.

Parameters:: data_pointer (any) – either filename or dictionary of string docking results
Returns:: parsed results ready to be inserted in database
Return type:: dict