Ringtail package

Submodules

ringtail.cloptionparser module

class ringtail.cloptionparser.CLOptionParser

Bases: object

Command line option/argument parser. Options and switches are utilized in the script ‘rt_process_vs.py’.

process_mode

operating in ‘write’ or ‘read’ mode

Type:

str

rtcore

ringtail core object initialized with the provided db_file

Type:

RingtailCore

filters

fully parsed and organized optional filters

Type:

dict

file_sources

fully parsed docking results and receptor files

Type:

dict

writeopts

fully parsed arguments related to database writing

Type:

dict

storageopts

fully parsed arguments related to how the storage system behaves

Type:

dict

outputopts

fully parsed arguments related to output and reading from the database

Type:

dict

print_summary

switch to print database summary

Type:

bool

filtering

switch to run filtering method

Type:

bool

plot

switch to plot the data

Type:

bool

export_bookmark_db

switch to export bookmark as a new database

Type:

bool

export_receptor

switch to export receptor information to pdbqt

Type:

bool

pymol

switch to visualize ligands in pymol

Type:

bool

data_from_bookmark

switch to write bookmark data to the output log file

Type:

bool

Raises:

OptionError – Error when an option cannot be parsed correctly

process_options(parsed_opts)

Process and organize command line options into ringtail options, filter dictionaries, and ringtail core attributes

Parameters:

parsed_opts (argparse.Namespace) – arguments provided through the cmdline_parser method.

ringtail.cloptionparser.cmdline_parser(defaults: dict = {})

Parses options provided using the command line. All arguments are first populated with default values. If a config file is provided, these will overwrite default values. Any single arguments provided using the argument parser will overwrite default and config file values.

Parameters:

defaults (dict) – default argument values
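
The precedence described above (defaults first, then config file values, then individual command line arguments) can be sketched as a layered merge. This is an illustration only: `resolve_options` and the sample option values are hypothetical and not part of Ringtail, and the config file is assumed to be JSON.

```python
import json
from typing import Optional


def resolve_options(defaults: dict, config_text: Optional[str], cli_args: dict) -> dict:
    """Layer option sources: defaults < config file < command line (hypothetical helper)."""
    options = dict(defaults)  # start from the default values
    if config_text is not None:
        # values from the config file overwrite the defaults
        options.update(json.loads(config_text))
    # only arguments actually given on the command line (not None) overwrite earlier layers
    options.update({key: value for key, value in cli_args.items() if value is not None})
    return options


resolved = resolve_options(
    defaults={"max_poses": 3, "docking_mode": "dlg"},
    config_text='{"max_poses": 5}',
    cli_args={"max_poses": None, "docking_mode": "vina"},
)
print(resolved)
```

Treating `None` as "not provided" keeps an unset command line argument from erasing a value that came from the config file.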

ringtail.exceptions module

exception ringtail.exceptions.DatabaseConnectionError

Bases: StorageError

exception ringtail.exceptions.DatabaseInsertionError

Bases: StorageError

exception ringtail.exceptions.DatabaseQueryError

Bases: StorageError

exception ringtail.exceptions.DatabaseTableCreationError

Bases: StorageError

exception ringtail.exceptions.DatabaseViewCreationError

Bases: StorageError

exception ringtail.exceptions.FileParsingError

Bases: Exception

exception ringtail.exceptions.MultiprocessingError

Bases: Exception

exception ringtail.exceptions.NoInputError

Bases: OptionError

exception ringtail.exceptions.OptionError

Bases: Exception

exception ringtail.exceptions.OutputError

Bases: Exception

exception ringtail.exceptions.RTCoreError

Bases: Exception

exception ringtail.exceptions.ResultsProcessingError

Bases: Exception

exception ringtail.exceptions.StorageError

Bases: Exception

exception ringtail.exceptions.WriteToStorageError

Bases: Exception
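
The base classes listed above determine which `except` clause catches which error. The sketch below uses local stand-in classes that mirror a few of the documented bases; they are not imported from ringtail, and `classify` is a hypothetical helper. Note that, per the listing, WriteToStorageError derives from Exception and so would not be caught by `except StorageError`.

```python
# Local stand-ins mirroring the documented hierarchy (not the ringtail classes themselves).
class StorageError(Exception):
    pass


class DatabaseQueryError(StorageError):
    pass


class OptionError(Exception):
    pass


class NoInputError(OptionError):
    pass


class WriteToStorageError(Exception):
    pass


def classify(exc: Exception) -> str:
    """Report which handler catches the given exception (hypothetical helper)."""
    try:
        raise exc
    except StorageError:
        return "storage"
    except OptionError:
        return "option"
    except Exception:
        return "other"


print(classify(DatabaseQueryError("bad query")))
print(classify(NoInputError("no input")))
print(classify(WriteToStorageError("write failed")))
```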

ringtail.interactions module

class ringtail.interactions.InteractionFinder(rec_string, interaction_cutoff_radii)

Bases: object

Class for handling and calculating ligand-receptor interactions.

rec_string

string describing the receptor

Type:

str

interaction_cutoff_radii

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:

list(float)

find_pose_interactions(lig_atomtype_list: list, lig_coordinates: list) dict

Method that identifies interactions for a pose within the cutoff distances given in the main class.

Parameters:
  • lig_atomtype_list (list) – list of atoms in the ligand

  • lig_coordinates (list) – coordinates for the atoms in the ligand

Returns:

all interaction details for a given ligand pose

Return type:

dict
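
As a rough illustration of a distance cutoff check, the sketch below lists ligand-receptor atom pairs that lie within a cutoff radius. It is not the InteractionFinder implementation: `atoms_within_cutoff`, its arguments, and the coordinates are hypothetical, and real interaction detection also depends on atom types.

```python
import math


def atoms_within_cutoff(lig_coordinates, rec_coordinates, cutoff):
    """Return (ligand_index, receptor_index, distance) for atom pairs within cutoff (ångströms)."""
    close_pairs = []
    for lig_index, lig_xyz in enumerate(lig_coordinates):
        for rec_index, rec_xyz in enumerate(rec_coordinates):
            distance = math.dist(lig_xyz, rec_xyz)  # Euclidean distance
            if distance <= cutoff:
                close_pairs.append((lig_index, rec_index, distance))
    return close_pairs


pairs = atoms_within_cutoff(
    lig_coordinates=[(0.0, 0.0, 0.0)],
    rec_coordinates=[(3.0, 0.0, 0.0), (0.0, 5.0, 0.0)],
    cutoff=4.0,
)
print(pairs)
```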

ringtail.mpmanager module

class ringtail.mpmanager.MPManager(docking_mode, max_poses, interaction_tolerance, store_all_poses, add_interactions, interaction_cutoffs, max_proc, storageman, storageman_class, chunk_size, target, receptor_file, file_pattern=None, file_sources=None, string_sources=None)

Bases: object

Manager that orchestrates parallel processing of docking results data, using one of the supported multiprocessors.

docking_mode

describes what docking engine was used to produce the results

Type:

str

max_poses

max number of poses to store for each ligand

Type:

int

interaction_tolerance

Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.

Type:

float

store_all_poses

Store all poses from docking results

Type:

bool

add_interactions

find and save interactions between ligand poses and receptor

Type:

bool

interaction_cutoffs

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:

list(float)

max_proc

Maximum number of processes to create during parallel file parsing.

Type:

int

storageman

storageman object

Type:

StorageManager

storageman_class

storagemanager child class/database type

Type:

StorageManager

chunk_size

how many tasks to send to a processor at a time

Type:

int

target

name of receptor

Type:

str

receptor_file

file path to receptor

Type:

str

file_pattern

file pattern to look for if recursively finding results files to process

Type:

str, optional

file_sources

RingtailOption object that holds all attributes related to results files

Type:

InputFiles, optional

string_sources

RingtailOption object that holds all attributes related to results strings

Type:

InputStrings, optional

num_files

number of files processed at any given time

Type:

int

process_results()

Processes results data (files or string sources) by adding them to the queue and starting their processing in multiprocess.
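
One way to picture `chunk_size` is as the number of tasks grouped together before being handed to a worker. The sketch below is an assumption about that idea, not the MPManager code: `chunked` and the file names are hypothetical.

```python
from itertools import islice


def chunked(tasks, chunk_size):
    """Yield successive lists of at most chunk_size tasks (hypothetical helper)."""
    iterator = iter(tasks)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return  # no tasks left
        yield chunk


chunks = list(chunked(["a.dlg", "b.dlg", "c.dlg"], chunk_size=2))
print(chunks)
```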

ringtail.mpreaderwriter module

class ringtail.mpreaderwriter.DockingFileReader(*args: Any, **kwargs: Any)

Bases: Process

This class is the individual worker for processing docking results. One instance of this class is instantiated for each available processor.

queueIn

current queue for the processor/file reader

Type:

multiprocess.Queue

queueOut

queue for the processor/file reader after adding or removing an item

Type:

multiprocess.Queue

pipe_conn

pipe connection to the reader

Type:

multiprocess.Pipe

storageman

storageman object

Type:

StorageManager

storageman_class

storagemanager child class/database type

Type:

StorageManager

docking_mode

describes what docking engine was used to produce the results

Type:

str

max_poses

max number of poses to store for each ligand

Type:

int

interaction_tolerance

Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.

Type:

float

store_all_poses

Store all poses from docking results

Type:

bool

add_interactions

find and save interactions between ligand poses and receptor

Type:

bool

interaction_cutoffs

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:

list(float)

target

receptor name

Type:

str

run()

Method overload from parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method which is called by the initialization (see below) with start()

Raises:
  • NotImplementedError – if parser for specific docking result type is not implemented

  • FileParsingError

class ringtail.mpreaderwriter.Writer(*args: Any, **kwargs: Any)

Bases: Process

This class is a listener that retrieves data from the queue and writes it into the database

process_data(data_packet)

Breaks up the data in the data_packet to distribute between the different arrays to be inserted in the database.

Parameters:

data_packet (any) – File packet to be processed

run()

Method overload from parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method which is called by the initialization (see below) with start()

Raises:

WriteToStorageError

write_to_storage()

Inserting data to the database through the designated storagemanager.
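
The reader/writer arrangement described in this module follows a producer-consumer pattern: workers take items from an input queue, parse them, and place results on an output queue that a single writer drains. The sketch below shows that pattern with standard library threads standing in for multiprocess.Process workers. All names (`reader`, `writer`, `run_pipeline`, `SENTINEL`) are hypothetical, and the "parsing" step is a placeholder.

```python
import queue
import threading

SENTINEL = None  # marks the end of work on a queue


def reader(queue_in, queue_out):
    """Worker: take raw items, 'parse' them, and pass the results on."""
    while True:
        item = queue_in.get()
        if item is SENTINEL:
            queue_out.put(SENTINEL)  # tell the writer this reader is done
            break
        queue_out.put(item.upper())  # placeholder for parsing a docking result


def writer(queue_out, num_readers, storage):
    """Listener: collect parsed items until every reader has finished."""
    finished = 0
    while finished < num_readers:
        item = queue_out.get()
        if item is SENTINEL:
            finished += 1
        else:
            storage.append(item)  # placeholder for a database insert


def run_pipeline(items, num_readers=2):
    queue_in, queue_out, storage = queue.Queue(), queue.Queue(), []
    workers = [
        threading.Thread(target=reader, args=(queue_in, queue_out))
        for _ in range(num_readers)
    ]
    workers.append(threading.Thread(target=writer, args=(queue_out, num_readers, storage)))
    for worker in workers:
        worker.start()
    for item in items:
        queue_in.put(item)
    for _ in range(num_readers):
        queue_in.put(SENTINEL)  # one stop signal per reader
    for worker in workers:
        worker.join()
    return sorted(storage)


stored = run_pipeline(["lig1", "lig2", "lig3"])
print(stored)
```

Keeping a single writer means only one worker touches the storage, which avoids concurrent inserts.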

ringtail.outputmanager module

class ringtail.outputmanager.OutputManager(log_file=None, export_sdf_path=None)

Bases: object

Class for creating outputs, can be a context manager to handle log files

log_file

name for log file

Type:

str

export_sdf_path

path for exporting SDF molecule files

Type:

str

_log_open

if log file is open or not

Type:

bool

close_logfile()

Closes the log file properly and resets the file pointer to the filename

log_num_passing_ligands(number_passing_ligands: int)

Write the number of ligands which pass given filter to log file

Parameters:

number_passing_ligands (int) – number of ligands that passed filter

Raises:

OutputError

open_logfile(write_filters_header=True)

Opens log file and creates it if needed

Parameters:

write_filters_header (bool) – only used because one method does not take the same headers

Raises:

OutputError
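
The class description notes that the output manager can act as a context manager for log files. A minimal sketch of that pattern follows: the file is opened on entry and closed on exit, even if an error occurs. `LogWriter` and `write_line` are hypothetical stand-ins, not the OutputManager API.

```python
import os
import tempfile


class LogWriter:
    """Hypothetical stand-in: opens a log file on entry and closes it on exit."""

    def __init__(self, log_file):
        self.log_file = log_file
        self._handle = None
        self._log_open = False

    def __enter__(self):
        self._handle = open(self.log_file, "w")  # creates the file if needed
        self._log_open = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._handle.close()
        self._log_open = False
        return False  # do not suppress exceptions raised inside the with-block

    def write_line(self, text):
        self._handle.write(text + "\n")


log_path = os.path.join(tempfile.mkdtemp(), "output_log.txt")
with LogWriter(log_path) as log:
    log.write_line("Number passing ligands: 3")

with open(log_path) as handle:
    log_contents = handle.read()
print(log_contents)
```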

plot_all_data(xdata, ydata, num_of_bins: int = 100)

Takes dictionary of binned data where key is the coordinates of the bin and value is the number of points in that bin. Adds to scatter plot colored by value

Parameters:
  • xdata (list) – list of x axis data (needs to be same length as ydata)

  • ydata (list) – list of y axis data (needs to be same length as xdata)

  • num_of_bins (int) – number of bins to organize data in

Returns:

matplotlib.pyplot.figure

Raises:

OutputError
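
The binned dictionary mentioned in the description, where each key identifies a bin and each value counts the points in it, might be built as in the sketch below. This is an assumption about the data shape, not the plot_all_data implementation: `bin_points` is hypothetical and uses bin indices as keys.

```python
from collections import Counter


def bin_points(xdata, ydata, num_of_bins=100):
    """Count points per 2D bin; keys are (x_bin, y_bin) index pairs."""
    if len(xdata) != len(ydata):
        raise ValueError("xdata and ydata must have the same length")
    x_min, x_max = min(xdata), max(xdata)
    y_min, y_max = min(ydata), max(ydata)

    def bin_index(value, low, high):
        if high == low:
            return 0  # all values identical: single bin
        # clamp so the maximum value falls in the last bin
        return min(int((value - low) / (high - low) * num_of_bins), num_of_bins - 1)

    return Counter(
        (bin_index(x, x_min, x_max), bin_index(y, y_min, y_max))
        for x, y in zip(xdata, ydata)
    )


counts = bin_points([0.0, 0.1, 1.0], [0.0, 0.1, 1.0], num_of_bins=2)
print(dict(counts))
```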

plot_single_points(x: list, y: list, markersize: int = 20, color='crimson')

Add points to scatter plot with given x and y coordinates and color.

Parameters:
  • x (list) – x coordinates

  • y (list) – y coordinates

  • markersize (int, optional) – Marker size for points. Default 20.

  • color (str, optional) – Color for points. Default ‘crimson’.

Raises:

OutputError

save_scatterplot()

Saves current figure as scatter.png

Raises:

OutputError

scatter_hist(x, y, z, ax_histx, ax_histy)

Makes scatterplot with a histogram on each axis

Parameters:
  • x (list) – x coordinates for data

  • y (list) – y coordinates for data

  • z (list) – z coordinates for data

  • ax (matplotlib.axis) – scatterplot axis

  • ax_histx (matplotlib.axis) – x histogram axis

  • ax_histy (matplotlib.axis) – y histogram axis

Raises:

OutputError

write_filter_log(lines)

Writes lines from results iterable into log file

Parameters:

lines (iterable) – Iterable with tuples of data for writing into log

Raises:

OutputError

Returns:

number of ligands passing that are written to log file

Return type:

int

write_filters_to_log(filters_dict, included_interactions, additional_info='')

Takes dictionary of filters, formats as string and writes to log file

Parameters:
  • filters_dict (dict) – dictionary with filtering options

  • included_interactions (list) – types of interactions to include in the filtering

  • additional_info (str) – any additional information to write to top of log file

Raises:

OutputError

write_find_similar_header(query_ligname, cluster_name)

Properly formats header for the log file find_similar_ligands

write_maxmiss_union_header()

Properly formats header for the log file if using max_miss and enumerate_interaction_combs

write_out_mol(filename, mol, flexres_mols, properties)

Writes out given mol as sdf. Will create the specified sdf folder in current working directory if needed.

Parameters:
  • filename (str) – name of SDF file that will be written to

  • mol (RDKit.Chem.Mol) – RDKit molobject to be written to SDF

  • flexres_mols (list) – dictionary of rdkit molecules for flexible residues

  • properties (dict) – dictionary of list of properties to add to mol before writing

Raises:

OutputError

write_receptor_pdbqt(recname: str, receptor_compbytes)

Writes a pdbqt file from receptor “blob”

Parameters:
  • recname (str) – name of receptor to use in output filename

  • receptor_compbytes (blob) – receptor blob

write_results_bookmark_to_log(bookmark_name)

Write the name of the result bookmark into log

Parameters:

bookmark_name (str) – name of current results’ bookmark in db

Raises:

OutputError

ringtail.parsers module

ringtail.parsers.parse_single_dlg(fname)

Parse an ADGPU DLG file, either uncompressed or gzipped

Parameters:

fname (str) – ligand docking result file name

Returns:

parsed results ready to be inserted in database

Return type:

dict

ringtail.parsers.parse_vina_result(data_pointer) dict

Parser for vina docking results, supporting either pdbqt or gzipped (.gz) files, or with the docking results provided as a string.

Parameters:

data_pointer (any) – either filename or dictionary of string docking results

Returns:

parsed results ready to be inserted in database

Return type:

dict

ringtail.parsers.receptor_pdbqt_parser(fname)

Parse a receptor PDBQT file into a list of dictionaries, each containing the data for a single atom line

Parameters:

fname (string) – name of receptor pdbqt file to parse
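
To illustrate the "one dictionary per atom line" shape, the sketch below slices an atom line by fixed columns. It assumes the usual PDB column layout with the PDBQT partial charge in columns 71-76 and the atom type in columns 78-79. The function names and dictionary keys are hypothetical and may differ from what receptor_pdbqt_parser returns.

```python
def parse_pdbqt_atom_line(line: str) -> dict:
    """Slice one ATOM/HETATM line by fixed columns (assumed PDBQT layout)."""
    return {
        "atom_id": int(line[6:11]),
        "atom_name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21].strip(),
        "res_id": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "partial_charge": float(line[70:76]),
        "atom_type": line[77:79].strip(),
    }


def parse_receptor_lines(lines):
    """Keep only atom records and parse each into a dictionary."""
    return [
        parse_pdbqt_atom_line(line)
        for line in lines
        if line.startswith(("ATOM", "HETATM"))
    ]


# Sample line assembled field by field so the column positions are explicit.
sample_line = "".join([
    "ATOM  ",                             # columns 1-6: record name
    "    1",                              # columns 7-11: atom serial
    " ",
    " N  ",                               # columns 13-16: atom name
    " ",
    "VAL",                                # columns 18-20: residue name
    " ",
    "A",                                  # column 22: chain
    " 279",                               # columns 23-26: residue number
    "    ",
    "   1.000", "   2.000", "   3.000",   # columns 31-54: x, y, z
    "  1.00", "  0.00",                   # columns 55-66: occupancy, temperature factor
    "    ",
    "-0.350",                             # columns 71-76: partial charge
    " ",
    "N ",                                 # columns 78-79: atom type
])

atoms = parse_receptor_lines([sample_line, "REMARK not an atom line"])
print(atoms)
```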

ringtail.receptormanager module

class ringtail.receptormanager.ReceptorManager

Bases: object

Class with methods dealing with formatting of receptor information

static blob2str(receptor_blob)

Converts a compressed receptor blob back into a receptor string

Parameters:

receptor_blob (blob) – zipped receptor blob

Returns:

receptor string

Return type:

str

static make_receptor_blobs(file_list)

Creates compressed receptor info

Parameters:

file_list (str) – path to receptor file

Returns:

compressed receptor

Return type:

blob
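
The two static methods are inverses of each other: one compresses receptor text into a blob, the other restores the text. The sketch below shows such a round trip with gzip. The compression scheme is an assumption, and `make_blob` and `blob_to_str` are hypothetical helpers rather than the ReceptorManager methods.

```python
import gzip


def make_blob(receptor_text: str) -> bytes:
    """Compress receptor text into bytes suitable for a database blob column."""
    return gzip.compress(receptor_text.encode("utf-8"))


def blob_to_str(receptor_blob: bytes) -> str:
    """Decompress a blob back into the original receptor text."""
    return gzip.decompress(receptor_blob).decode("utf-8")


receptor_text = "ATOM      1  N   VAL A 279\n"
blob = make_blob(receptor_text)
print(blob_to_str(blob) == receptor_text)
```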

ringtail.resultsmanager module

class ringtail.resultsmanager.ResultsManager(docking_mode: str = None, max_poses: int = None, interaction_tolerance: float = None, store_all_poses: bool = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, storageman: StorageManager = None, storageman_class: StorageManager = None, chunk_size: int = 1, parser_manager: str = 'multiprocess', file_sources=None, string_sources=None)

Bases: object

Class that handles the processing of the results, including passing on the docking results to the appropriate parallel/multi-processing unit

Parameters:
  • max_poses (int) – max number of poses to store for each ligand

  • interaction_tolerance (float) – Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.

  • store_all_poses (bool) – Store all poses from docking results

  • add_interactions (bool) – find and save interactions between ligand poses and receptor

  • interaction_cutoffs (list(float)) – cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

  • max_proc (int) – Maximum number of processes to create during parallel file parsing.

  • storageman (StorageManager) – storageman object

  • storageman_class (StorageManager) – storagemanager child class/database type

  • chunk_size (int) – how many tasks to send to a processor at a time

  • parser_manager (str, optional) – what parallelization or multiprocessing package to use

  • file_sources (InputFiles, optional) – given file sources including the receptor file

  • string_sources (InputStrings, optional) – given string sources including the path to the receptor

Raises:

ResultsProcessingError

process_docking_data()

Processes docking data in the form of files or strings

Raises:

ResultsProcessingError – if no file or string sources are provided, or if both are provided
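
The rule stated above (exactly one of file sources or string sources must be provided) can be sketched as a small validation step. `choose_source` is hypothetical, and the exception class is a local stand-in for the one of the same name.

```python
class ResultsProcessingError(Exception):
    """Local stand-in for the exception of the same name."""


def choose_source(file_sources=None, string_sources=None):
    """Return the single provided source, or raise if none or both are given."""
    if file_sources is not None and string_sources is not None:
        raise ResultsProcessingError("Provide either file sources or string sources, not both.")
    if file_sources is None and string_sources is None:
        raise ResultsProcessingError("No file or string sources were provided.")
    if file_sources is not None:
        return ("files", file_sources)
    return ("strings", string_sources)


print(choose_source(file_sources=["a.dlg"]))
```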

ringtail.ringtailcore module

class ringtail.ringtailcore.RingtailCore(db_file: str = 'output.db', storage_type: str = 'sqlite', docking_mode: str = 'dlg', logging_level: str = 'WARNING')

Bases: object

Core class for coordinating different actions on virtual screening, including adding results to storage, filtering and clustering, outputting data as rdkit molecules, plotting docking results, and visualizing select ligands in pymol.

db_file

name of database file being operated on

Type:

str

docking_mode

specifies what docking mode has been used for the results in the database

Type:

str

storageman

Interface module with database

Type:

StorageManager

resultsman

Module to deal with results processing before adding to database

Type:

ResultsManager

outputman

Manager for output tasks: log writing, plotting, ligand SDF writing, and starting pymol sessions

Type:

OutputManager

filters

object holding all optional filters

Type:

Filters

_run_mode

refers to whether ringtail is run from the command line or through direct API use, where the former is more restrictive

Type:

str

add_results_from_files(file: str = None, file_path: str = None, file_list: str = None, file_pattern: str = None, recursive: bool = None, receptor_file: str = None, save_receptor: bool = None, filesources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)

Call storage manager to process result files and add them to the database. Creates a database or adds to an existing one. Options can be provided as a dict or as individual options. If both are provided, individual options will overwrite those from the dictionary.

Parameters:
  • file (str or list(str), optional) – ligand result file

  • file_path (str or list(str), optional) – list of folders containing one or more result files

  • file_list (str or list(str), optional) – list of ligand result file(s)

  • file_pattern (str) – file pattern to use with recursive search in a file_path, “.dlg” for AutoDock-GPU and “.pdbqt” for vina

  • recursive (bool) – used to recursively search file_path for folders inside folders

  • receptor_file (str) – string containing the receptor .pdbqt

  • save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)

  • filesources_dict (dict) – file sources already as an object

  • duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.

  • store_all_poses (bool) – store all ligand poses, does it take precedence over max poses?

  • max_poses (int) – how many poses to save (ordered by some score?)

  • add_interactions (bool) – add ligand-receptor interaction data, only in vina mode

  • interaction_tolerance (float) – longest ångström distance that is considered interaction?

  • interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction

  • max_proc (int) – max number of computer processors to use for file reading

  • options_dict (dict) – write options as a dict

Raises:

OptionError
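
The description states that individually passed options overwrite those from the options dictionary. A minimal sketch of that merge follows, assuming that `None` means "not provided"; `merge_write_options` is a hypothetical helper, not part of RingtailCore.

```python
def merge_write_options(options_dict=None, **individual_options):
    """Start from the dictionary, then let explicitly given options overwrite it."""
    merged = dict(options_dict or {})
    merged.update(
        {name: value for name, value in individual_options.items() if value is not None}
    )
    return merged


merged = merge_write_options(
    {"max_poses": 3, "store_all_poses": False},
    max_poses=5,            # given individually: overwrites the dictionary value
    store_all_poses=None,   # not given: the dictionary value is kept
)
print(merged)
```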

add_results_from_vina_string(results_strings: dict = None, receptor_file: str = None, save_receptor: bool = None, resultsources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)

Call storage manager to process the given vina output string and add it to the database. Options can be provided as a dict or as individual options. Creates a database or adds to an existing one.

Parameters:
  • results_strings (dict) – dictionary containing the ligand identifier and docking results string

  • receptor_file (str) – string containing the receptor .pdbqt

  • save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)

  • resultsources_dict (dict) – file sources already as an object

  • duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.

  • store_all_poses (bool) – store all ligand poses, does it take precedence over max poses?

  • max_poses (int) – how many poses to save (ordered by some score?)

  • add_interactions (bool) – add ligand-receptor interaction data, only in vina mode

  • interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction

  • max_proc (int) – max number of computer processors to use for file reading

  • options_dict (dict) – write options as a dict

Raises:

OptionError

static default_dict() dict

Creates a dict of all Ringtail options.

Returns:

json string with options

Return type:

str

display_pymol(bookmark_name=None)

Launch pymol session and plot of LE vs docking score. Displays molecules when clicked.

Parameters:

bookmark_name (str) – bookmark name to use in pymol. ‘None’ uses the whole db?

property docking_mode

Private method to retrieve docking mode

Returns:

docking mode

Return type:

str

drop_bookmark(bookmark_name: str)

Drops specified bookmark from the database

Parameters:

bookmark_name (str) – name of bookmark to be dropped.

export_bookmark_db(bookmark_name: str = None) str

Export database containing data from bookmark

Parameters:

bookmark_name (str) – name for bookmark_db

Returns:

name of the new, exported database

Return type:

str

export_csv(requested_data: str, csv_name: str, table=False)

Get requested data from database, export as CSV

Parameters:
  • requested_data (str) – Table name or SQL-formatted query

  • csv_name (str) – Name for exported CSV file

  • table (bool) – flag indicating whether the requested data is a table name
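
Since the default storage type is sqlite, exporting a table or query result as CSV can be pictured with the standard `sqlite3` and `csv` modules. The sketch below is not the export_csv implementation: `export_query_to_csv`, the table name `Results`, and its columns are hypothetical.

```python
import csv
import io
import sqlite3


def export_query_to_csv(connection, requested_data, out_stream, table=False):
    """Write a table or the result of an SQL query to out_stream as CSV."""
    # NOTE: a table name is interpolated directly into the query; pass trusted names only.
    query = f"SELECT * FROM {requested_data}" if table else requested_data
    cursor = connection.execute(query)
    writer = csv.writer(out_stream)
    writer.writerow([column[0] for column in cursor.description])  # header row
    writer.writerows(cursor)


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL)")
connection.executemany(
    "INSERT INTO Results VALUES (?, ?)", [("lig1", -7.5), ("lig2", -6.1)]
)

buffer = io.StringIO()
export_query_to_csv(connection, "Results", buffer, table=True)
csv_text = buffer.getvalue()
print(csv_text)
```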

export_receptors()

Export receptor in database to pdbqt

filter(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, filters_dict: dict | None = None, enumerate_interaction_combs: bool = False, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, log_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, bookmark_name: str = None, filter_bookmark: str = None, options_dict: dict | None = None, return_iter=False)

Prepare list of filters, then hand it off to storageman to perform filtering. Creates log of all ligand docking results that pass.

Parameters:
  • Filters –

    • eworst (float) – specify the worst energy value accepted

    • ebest (float) – specify the best energy value accepted

    • leworst (float) – specify the worst ligand efficiency value accepted

    • lebest (float) – specify the best ligand efficiency value accepted

    • score_percentile (float) – specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.

    • le_percentile (float) – specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.

    • vdw_interactions (list[tuple]) – define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

    • hb_interactions (list[tuple]) – define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

    • reactive_interactions (list[tuple]) – check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

    • hb_count (list[tuple]) – accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]

    • react_any (bool) – check if ligand reacted with any residue

    • max_miss (int) – Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.

    • ligand_name (list[str]) – specify ligand name(s). Will combine name filters with OR, e.g., [[“lig1”, “lig2”]]

    • ligand_substruct (list[str]) – SMARTS pattern(s) for substructure matching, e.g., [[“ccc”, “CN”]]

    • ligand_substruct_pos (list[list[type]]) – SMARTS, index of atom in SMARTS, cutoff dist, and target XYZ coords, e.g., [[“[Oh]C”, 0, 1.2, -5.5, 10.0, 15.5]] -> [[“smart_string”, index_of_positioned_atom, cutoff_distance, x, y, z]]

    • ligand_max_atoms (int) – Maximum number of heavy atoms a ligand may have

    • ligand_operator (str) – logical join operator for multiple SMARTS (default: OR), either AND or OR

    • filters_dict (dict) – provide filters as a dictionary

  • Ligand results options –

    • enumerate_interaction_combs (bool) – When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.

    • output_all_poses (bool) – By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.

    • mfpt_cluster (float) – Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.

    • interaction_cluster (float) – Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.

    • log_file (str) – by default, results are saved in output_log.txt; if this option is used, ligands and requested info passing the filters will be written to the specified file

    • overwrite (bool) – by default, if a log file exists, it doesn’t get overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database

    • order_results (str) – Stipulates how to order the results when written to the log file. By default, results are ordered by the order in which they were added to the database. ONLY TAKES ONE OPTION. Available fields are:

      “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds)

    • outfields (str) – defines which fields are used when reporting the results (to stdout and to the log file); fields are specified as comma-separated values, e.g. --outfields=e,le,hb; by default, docking_score (energy) and ligand name are reported; ligand name is always reported in the first column. Available fields are:

      “Ligand_name” (Ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name)

    • bookmark_name (str) – name for resulting bookmark. Default value is ‘passing_results’

    • filter_bookmark (str) – name of bookmark to perform filtering over

    • options_dict (dict) – write options as a dict

    • return_iter (bool) – return an iterable of all of the filtering results

Returns:

number of ligands passing filter

iter (optional): an iterable of all of the filtering results

Return type:

int
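
The max_miss behaviour described above, which builds every combination of interaction filters that leaves out up to max_miss of them, can be sketched with `itertools.combinations`. This is an illustration rather than Ringtail's code: `interaction_filter_combinations` is hypothetical, and the third residue string is made up for the example.

```python
from itertools import combinations


def interaction_filter_combinations(interaction_filters, max_miss):
    """List every subset that omits between 0 and max_miss of the given filters."""
    total = len(interaction_filters)
    combos = []
    # never try to omit more filters than exist
    for missing in range(min(max_miss, total) + 1):
        combos.extend(combinations(interaction_filters, total - missing))
    return combos


combos = interaction_filter_combinations(
    ["A:VAL:279:", "A:LYS:162:", "A:ASP:86:"], max_miss=1
)
for combo in combos:
    print(combo)
```

With three filters and max_miss=1 this gives the full set plus the three two-filter subsets. The number of combinations grows quickly with max_miss, which is consistent with the note that enumerating them can increase runtime.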

finalize_write()

Finalize database write by creating interaction tables and setting database version

find_similar_ligands(query_ligname: str)

Find ligands in cluster with query_ligname

Parameters:

query_ligname (str) – name of the ligand in the ligand table for which to find similar ligands

Returns:

number of ligands that are similar

Return type:

int

static generate_config_file_template()

Outputs to “config.json” in the current working directory if to_file = true; otherwise returns the dict of default option values used for the API (for the command line, a few more options are included that are always used explicitly when using the API)

Parameters:

to_file (bool) – whether to produce the template as a json string or as a file “config.json”

Returns:

file name of config file or json string with template including default values

Return type:

str

get_bookmark_names()

Method to retrieve all bookmark names in a database

Returns:

names of all bookmarks in a database

Return type:

list

static get_options_info() dict

Gets names, default values, and meta data for all Ringtail options.

get_plot_data(bookmark_name: str = None)

Get ligand efficiency and energy for all docking data and for ligands that passed filtering in specified bookmark. Each tuple in the respective lists contains docking_score, leff, pose_id, and ligand name.

Parameters:

bookmark_name (str)

Returns:

[all_data], [filtered_data]

Return type:

list(tuple), list(tuple)

get_previous_filter_data(outfields=None, bookmark_name=None, log_file=None)

Get data requested in self.out_opts[‘outfields’] from the results bookmark of a previous filtering

Parameters:
  • outfields (str) – use outfields as described in RingtailOptions > StorageOptions

  • bookmark_name (str) – bookmark for which the filters were used

ligands_rdkit_mol(bookmark_name=None, write_nonpassing=False) dict

Creates a dictionary of RDKit mols of all ligands specified from a bookmark, either excluding (default) or including those ligands that did not pass the filter(s).

Parameters:
  • bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering

  • write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands

Returns:

dictionary containing ligand names, RDKit mols, flexible residue mols, and other ligand properties

Return type:

all_mols (dict)

plot(save=True, bookmark_name: str = None, return_fig_handle: bool = False)

Get data needed for creating Ligand Efficiency vs Energy scatter plot from storageManager. Call OutputManager to create plot.

Parameters:
  • save (bool) – whether to save plot to the current directory

  • bookmark_name (str) – bookmark from which to fetch filtered data to plot

  • return_fig_handle (bool) – use to return a handle to the matplotlib figure instead of saving or showing figure

Returns:

will not show figure if returning figure handle

Return type:

matplotlib.pyplot.figure (optional)

produce_summary(columns=['docking_score', 'leff'], percentiles=[1, 10]) None

Print summary of data in storage to stdout

Parameters:
  • columns (list(str)) – data columns used to prepare summary

  • percentiles (list(int)) – cutoff percentiles for the summary
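
As an illustration of percentile cutoffs for a summary, the sketch below computes nearest-rank percentiles over a list of docking scores. It assumes that lower (more negative) scores are better, so the top 10 percent are the lowest values. `percentile_cutoff` and the scores are hypothetical, and the example uses percentiles 10 and 50 rather than the defaults so that the small data set gives distinct cutoffs.

```python
import math


def percentile_cutoff(values, percentile):
    """Nearest-rank percentile: smallest value with at least `percentile` percent of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


scores = [-9.0, -8.5, -8.0, -7.5, -7.0, -6.5, -6.0, -5.5, -5.0, -4.5]
summary = {p: percentile_cutoff(scores, p) for p in [10, 50]}
print(summary)
```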

save_receptor(receptor_file)

Add receptor to database.

Parameters:

receptor_file (str) – path to receptor file

set_filters(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, dict: dict = None)

Create a filter object containing all numerical and string filters.

Parameters:
  • eworst (float) – specify the worst energy value accepted

  • ebest (float) – specify the best energy value accepted

  • leworst (float) – specify the worst ligand efficiency value accepted

  • lebest (float) – specify the best ligand efficiency value accepted

  • score_percentile (float) – specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.

  • le_percentile (float) – specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.

  • vdw_interactions (list[tuple]) – define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

  • hb_interactions (list[tuple]) – define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

  • reactive_interactions (list[tuple]) – check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

  • hb_count (list[tuple]) – accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]

  • react_any (bool) – check if ligand reacted with any residue

  • max_miss (int) – Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from the given set. Default will only return the union of poses passing the interaction filter combinations. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.

  • ligand_name (list[str]) – specify ligand name(s). Will combine name filters with OR, e.g., [“lig1”, “lig2”]

  • ligand_substruct (list[str]) – SMARTS pattern(s) for substructure matching, e.g., [“ccc”, “CN”]

  • ligand_substruct_pos (list[str]) – SMARTS pattern, index of atom in SMARTS, cutoff distance, and target XYZ coordinates, e.g., [‘”[Oh]C” 0 1.2 -5.5 10.0 15.5’] -> [“smart_string index_of_positioned_atom cutoff_distance x y z”]

  • ligand_max_atoms (int) – Maximum number of heavy atoms a ligand may have

  • ligand_operator (str) – logical join operator for multiple SMARTS (default: OR), either AND or OR

  • dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
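The interaction filter format described above can be illustrated with a small helper. This is a minimal sketch rather than Ringtail code: `to_interaction_tuple` is a hypothetical name, and the sketch assumes that the optional leading `-` in `[-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]` marks an interaction that is not wanted.

```python
def to_interaction_tuple(spec: str) -> tuple:
    """Convert '[-]CHAIN:RES:NUM:ATOM_NAME' into ('CHAIN:RES:NUM:ATOM_NAME', wanted).

    Hypothetical helper; assumes a leading '-' means the interaction is unwanted.
    """
    wanted = not spec.startswith("-")
    residue = spec.lstrip("-")
    # The format has four colon-separated fields; fields may be empty, as in 'A:VAL:279:'.
    if residue.count(":") != 3:
        raise ValueError(f"expected CHAIN:RES:NUM:ATOM_NAME, got {spec!r}")
    return (residue, wanted)


vdw_interactions = [to_interaction_tuple(s) for s in ("A:VAL:279:", "-A:LYS:162:")]
print(vdw_interactions)
```

The resulting list has the same shape as the vdw_interactions, hb_interactions, and reactive_interactions examples in the parameter list.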

set_output_options(log_file: str = None, export_sdf_path: str = None, enumerate_interaction_combs: bool = None, dict: dict = None)

Creates output options object that holds attributes related to reading and outputting results. Will assign log_file name and export_sdf_path to the output_manager object.

Parameters:
  • log_file (str) – by default, results are saved in “output_log.txt”; if this option is used, ligands and requested info passing the filters will be written to specified file

  • export_sdf_path (str) – specify the path where to save poses of ligands passing the filters (SDF format); if the directory does not exist, it will be created; if it already exists, an error is thrown unless --overwrite is used. NOTE: the log file will be automatically saved in this path. Ligands will be stored as SDF files in the order specified.

  • enumerate_interaction_combs (bool) – When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.

  • dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
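One way to picture the combinations that max_miss and enumerate_interaction_combs refer to: with n interaction filters and max_miss = m, every subset that drops at most m filters is considered. The sketch below uses itertools for this; `interaction_combinations` is a hypothetical helper and the residue strings are invented examples, so this is an illustration and not the Ringtail implementation.

```python
from itertools import combinations


def interaction_combinations(interactions: list, max_miss: int) -> list:
    """Return every subset of `interactions` that omits at most `max_miss` items."""
    n = len(interactions)
    combos = []
    for n_missing in range(0, min(max_miss, n) + 1):
        # Keep n - n_missing filters out of the n that were requested.
        combos.extend(combinations(interactions, n - n_missing))
    return combos


filters = ["A:VAL:279:", "A:LYS:162:", "A:ASP:86:"]
combos = interaction_combinations(filters, max_miss=1)
for combo in combos:
    print(combo)
```

The number of combinations grows quickly with max_miss, which is consistent with the note that enumeration can significantly increase runtime.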

set_resultsman_attributes(store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, dict: dict = None)

Create results_manager_options object if needed, sets options, and assigns them to the results manager object.

Parameters:
  • store_all_poses (bool) – store all poses from input files; overrides max_poses

  • max_poses (int) – number of poses to store (top pose for each of the top n clusters)

  • add_interactions (bool) – add ligand-receptor interaction data, only in vina mode

  • interaction_tolerance (float) – RMSD tolerance; interactions of poses within this RMSD range of the top pose in a cluster are added to that top pose

  • interaction_cutoffs (list) – distance cutoffs in ångströms for hydrogen bond and VDW interactions, in that order

  • max_proc (int) – max number of computer processors to use for file reading

  • dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
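Each of the set_* methods accepts both individual keyword arguments and a dict, with individual arguments taking precedence. A minimal sketch of that precedence rule follows; `resolve_options` is a hypothetical helper, and treating None as "not provided" is an assumption based on the None defaults in the signatures above.

```python
def resolve_options(defaults: dict, dict_opts: dict = None, **individual) -> dict:
    """Merge option sources: defaults < dict_opts < individually passed arguments."""
    resolved = dict(defaults)
    resolved.update(dict_opts or {})
    # Assumption: None means "argument not provided", so it never overrides.
    resolved.update({k: v for k, v in individual.items() if v is not None})
    return resolved


defaults = {"store_all_poses": False, "max_poses": 3, "max_proc": None}
opts = resolve_options(
    defaults,
    {"max_poses": 5, "max_proc": 4},
    max_poses=10,
    max_proc=None,
)
print(opts)
```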

set_storageman_attributes(filter_bookmark: str = None, duplicate_handling: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, output_all_poses: str = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, dict: dict = None)

Create storage_manager_options object if needed, sets options, and assigns them to the storage manager object.

Parameters:
  • filter_bookmark (str) – Perform filtering over specified bookmark. (in output group in CLI)

  • duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.

  • overwrite (bool) – by default, if a log file exists, it doesn’t get overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database

  • order_results (str) – Stipulates how to order the results when written to the log file. By default, results are ordered as they were added to the database. ONLY TAKES ONE OPTION. Available fields are: “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der Waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds).

  • outfields (str) – defines which fields are used when reporting the results (to stdout and to the log file); fields are specified as comma-separated values, e.g. “--outfields=e,le,hb”; by default, docking_score (energy) and ligand name are reported. Available fields are: “Ligand_name” (ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der Waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name). Fields are printed in the order in which they are provided. Ligand name will always be returned and will be added in first position if not specified.

  • output_all_poses (bool) – By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.

  • mfpt_cluster (float) – Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.

  • interaction_cluster (float) – Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.

  • bookmark_name (str) – name for the resulting bookmark. Default value is “passing_results”

  • dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
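The “ignore” and “replace” choices for duplicate_handling mirror SQLite's conflict clauses. The sketch below shows how INSERT OR IGNORE and INSERT OR REPLACE behave on a table with a uniqueness constraint. The table layout is invented for illustration, and whether Ringtail issues exactly these statements is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative table only; not the Ringtail schema.
conn.execute(
    "CREATE TABLE Results (LigName TEXT, pose_rank INTEGER, docking_score REAL, "
    "UNIQUE (LigName, pose_rank))"
)
conn.execute("INSERT INTO Results VALUES ('lig1', 1, -7.2)")

# "ignore": the duplicate row is silently skipped and the stored score is kept.
conn.execute("INSERT OR IGNORE INTO Results VALUES ('lig1', 1, -9.9)")
after_ignore = conn.execute("SELECT docking_score FROM Results").fetchall()

# "replace": the existing row is deleted and the new row takes its place.
conn.execute("INSERT OR REPLACE INTO Results VALUES ('lig1', 1, -8.5)")
after_replace = conn.execute("SELECT docking_score FROM Results").fetchall()

print(after_ignore, after_replace)
conn.close()
```

Without either clause, a plain INSERT into a table that has no uniqueness constraint simply adds another row, which corresponds to the default behavior of allowing duplicate entries.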

update_database_version(consent=False, new_version='2.0.0')

Method to update database version from earlier versions to either 1.1.0 or 2.0.0

write_flexres_pdb(receptor_polymer, ligname: str, filename: str, bookmark_name: str = None)

Writes a receptor pdb with flexible residues based on the ligand provided

Parameters:
  • receptor_polymer (Polymer) – version of receptor produced by meeko

  • ligname (str) – ligand name for which the receptor flexible residue info should be collected

  • filename (str) – name of the output pdb, extension is optional, will default to ‘.pdb’

  • bookmark_name (str, optional) – will use last used bookmark if not specified, will not work in a db without any filtering performed

write_molecule_sdfs(sdf_path: str | None = None, all_in_one: bool = True, bookmark_name: str = None, write_nonpassing: bool = None)

Have output manager write molecule sdf files for passing results in given results bookmark

Parameters:
  • sdf_path (str, optional) – Optional path, existing or to be created in the current directory, where SDF files will be saved

  • all_in_one (bool, optional) – If True will write all molecules to one SDF (separated by $$$$), if False will write one molecule per SDF

  • bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering

  • write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands

Raises:

StorageError – if bookmark or data not found
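The effect of all_in_one can be sketched with plain strings. The helper below is hypothetical (`build_sdf_outputs` and the output file names are invented for illustration) and uses placeholder text instead of real molblocks; it only shows how records end with a `$$$$` line and how they are grouped into one file or many.

```python
def build_sdf_outputs(mol_blocks: dict, all_in_one: bool = True) -> dict:
    """Map output file names to SDF text for the given molblocks.

    Hypothetical helper: file names are invented for illustration.
    """
    # Every SDF record is terminated by a line containing only '$$$$'.
    records = {
        name: block.rstrip("\n") + "\n$$$$\n" for name, block in mol_blocks.items()
    }
    if all_in_one:
        return {"all_molecules.sdf": "".join(records.values())}
    return {f"{name}.sdf": record for name, record in records.items()}


# Placeholder text stands in for real molblocks.
mol_blocks = {"lig1": "<molblock for lig1>", "lig2": "<molblock for lig2>"}
single = build_sdf_outputs(mol_blocks, all_in_one=True)
separate = build_sdf_outputs(mol_blocks, all_in_one=False)
print(list(single), list(separate))
```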

ringtail.ringtailoptions module

class ringtail.ringtailoptions.Filters

Bases: RTOptions

Object that holds all optional filters.

checks()

Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

classmethod get_filter_keys(group) list

Provide keys associated with each of the filter groups.

Parameters:

group (str) – includes property filters, interaction filters, ligand filters, or all filters

Returns:

list of filter keywords associated with the specified group(s)

options = {'ebest': {'default': None, 'description': 'Specify the best energy value accepted.', 'type': <class 'float'>}, 'eworst': {'default': None, 'description': 'Specify the worst energy value accepted.', 'type': <class 'float'>}, 'hb_count': {'default': None, 'description': "Accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [('hb_count', 5)].", 'type': <class 'list'>}, 'hb_interactions': {'default': [], 'description': "Define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'le_percentile': {'default': None, 'description': 'Specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'lebest': {'default': None, 'description': 'Specify the best ligand efficiency value accepted.', 'type': <class 'float'>}, 'leworst': {'default': None, 'description': 'Specify the worst ligand efficiency value accepted.', 'type': <class 'float'>}, 'ligand_max_atoms': {'default': None, 'description': 'Maximum number of heavy atoms a ligand may have.', 'type': <class 'int'>}, 'ligand_name': {'default': None, 'description': "Specify list of ligand name(s). Will combine name filters with 'OR'", 'type': <class 'list'>}, 'ligand_operator': {'default': None, 'description': "Logical join operator for multiple substruct filters. Will apply within 'ligand_substruct' filters and within 'ligand_substruct_pos' filters (the two groups are always joined by 'AND').", 'type': <class 'str'>}, 'ligand_substruct': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator. 
If error delimit each substructure with ''.", 'type': <class 'list'>}, 'ligand_substruct_pos': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. For API use list with six elements ['[Oh]C', 0, 1.2, -5.5, 10.0, 15.5] -> ['smart_string', index_of_positioned_atom, cutoff_distance, x, y, z]. For the CLI use as a string without comma separators, separating each filter with commas -> '[Oh]C 0 1.2 -5.5 10.0 15.5'. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator", 'type': <class 'list'>}, 'max_miss': {'default': 0, 'description': "Will compute all possible combinations of interaction filters excluding up to 'max_miss' number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with 'enumerate_interaction_combs' for enumeration of poses passing each individual combination of interaction filters.", 'type': <class 'int'>}, 'react_any': {'default': None, 'description': 'Check if ligand reacted with any residue.', 'type': <class 'bool'>}, 'reactive_interactions': {'default': [], 'description': "Check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'score_percentile': {'default': None, 'description': 'Specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'vdw_interactions': {'default': [], 'description': "Define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}}
class ringtail.ringtailoptions.GeneralOptions

Bases: RTOptions

Object that holds choices and default values for miscellaneous arguments used for the command line interface only.

checks()

Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'db_file': {'default': 'output.db', 'description': 'DB file for which to use for all Ringtail activities.', 'type': <class 'str'>}, 'debug': {'default': None, 'description': 'Print additional error information to STDOUT and to log.', 'type': <class 'bool'>}, 'docking_mode': {'default': 'dlg', 'description': "specify AutoDock program used to generate results. Available options are 'DLG' and 'vina'. Will automatically change --file_pattern to *.dlg* for DLG and *.pdbqt* for vina.", 'type': <class 'str'>}, 'print_summary': {'default': None, 'description': 'prints summary information about stored data to STDOUT.', 'type': <class 'bool'>}, 'verbose': {'default': None, 'description': 'Print results passing filtering criteria to STDOUT and to log. NOTE: runtime may be slower option used.', 'type': <class 'bool'>}}
class ringtail.ringtailoptions.InputFiles

Bases: RTOptions

Class that handles sources of data to be written including ligand data paths and how to traverse them, and options to store receptor.

checks()

Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'file': {'default': None, 'description': 'Ligand docking output file to save. Compressed (.gz) files allowed. Only results files associated the same receptor allowed.', 'type': <class 'list'>}, 'file_list': {'default': None, 'description': 'Text file(s) containing the list of docking output files to save; relative or absolute paths are allowed. Compressed (.gz) files allowed.', 'type': <class 'list'>}, 'file_path': {'default': None, 'description': 'Directory(s) containing docking output files to save. Compressed (.gz) files allowed', 'type': <class 'list'>}, 'file_pattern': {'default': None, 'description': "Specify which pattern to use when searching for result files to process (only with 'file_path').", 'type': <class 'str'>}, 'receptor_file': {'default': None, 'description': 'Use with Vina mode. Give file for receptor PDBQT.', 'type': <class 'str'>}, 'recursive': {'default': None, 'description': "Enable recursive directory scan when 'file_path' is used.", 'type': <class 'bool'>}, 'save_receptor': {'default': None, 'description': "Saves receptor PDBQT to database. Receptor location must be specied with in 'receptor_file'.", 'type': <class 'bool'>}, 'target': {'default': None, 'description': "Name of receptor. This field is autopopulated if 'receptor_file' is supplied.", 'type': <class 'str'>}}
class ringtail.ringtailoptions.InputStrings

Bases: RTOptions

Class that handles docking results strings from vina docking, with options to store receptor. Takes docking results string as a dictionary of: {ligand_name: docking_result}

checks()

Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'receptor_file': {'default': None, 'description': 'Use with Vina mode. Give file for receptor PDBQT.', 'type': <class 'str'>}, 'results_strings': {'default': None, 'description': 'A dictionary of ligand names and ligand docking output results. Currently only valid for vina docking', 'type': <class 'dict'>}, 'save_receptor': {'default': None, 'description': "Saves receptor PDBQT to database. Receptor location must be specied with in 'receptor_file'.", 'type': <class 'bool'>}, 'target': {'default': None, 'description': "Name of receptor. This field is autopopulated if 'receptor_file' is supplied.", 'type': <class 'str'>}}
class ringtail.ringtailoptions.OutputOptions

Bases: RTOptions

Class that holds options related to reading and output from the database, including format for result export and alternate ways of displaying the data (plotting).

checks()

Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'enumerate_interaction_combs': {'default': None, 'description': "When used with 'max_miss' > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.", 'type': <class 'bool'>}, 'export_sdf_path': {'default': '', 'description': "Specify the path where to save poses of ligands passing the filters (SDF format); if the directory does not exist, it will be created; if it already exist, it will throw an error, unless the 'overwrite' is used  NOTE: the log file will be automatically saved in this path. Ligands will be stored as SDF files in the order specified.", 'type': <class 'str'>}, 'individual_sdf_files': {'default': False, 'description': 'Use if you like to print chosen molecules to individual SDF files, as opposed to one big SDF.', 'type': <class 'bool'>}, 'log_file': {'default': 'output_log.txt', 'description': "By default, read and filtering results are saved in 'output_log.txt'; if this option is used, ligands and requested info passing the filters will be written to specified file.", 'type': <class 'str'>}}
class ringtail.ringtailoptions.RTOptions

Bases: object

Holds standard methods for the ringtail option child classes. Options can be added using this format:

    options = {
        "": {
            "default": '',
            "type": '',
            "description": ""
        },
    }
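To make the options format concrete, the sketch below defines a small class with an options dictionary in that shape and turns each entry into an attribute initialized to its default. `ExampleOptions` is a hypothetical stand-in, not a Ringtail class; the two entries are copied from ResultsProcessingOptions in this reference, and type checking is left out.

```python
class ExampleOptions:
    """Hypothetical stand-in for an RTOptions child class."""

    options = {
        "max_poses": {
            "default": 3,
            "type": int,
            "description": "Store top pose for top n clusters.",
        },
        "store_all_poses": {
            "default": False,
            "type": bool,
            "description": "Store all poses from input files. Overrides 'max_poses'.",
        },
    }

    def __init__(self):
        # Set each option as an attribute, starting from its default value.
        for name, spec in self.options.items():
            setattr(self, name, spec["default"])

    def todict(self) -> dict:
        """Return the current option values as a plain dict."""
        return {name: getattr(self, name) for name in self.options}


opts = ExampleOptions()
opts.max_poses = 5
print(opts.todict())
```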

initialize_from_dict(dict: dict, name)

Initializes a child object using the values available in its option dictionary.

Parameters:
  • dict (dict) – of attributes to be initialized to the object

  • name (str) – name of the childclass/object

classmethod is_valid_path(path)

Checks if path exists in current directory.

Parameters:

path (str)

Returns:

True if path exists

Return type:

bool

todict()

Return class and its attributes as a dict of native types and not as objects (which they are if they are type checked using TypeSafe).

static valid_bookmark_name(name) bool

Checks that bookmark name adheres to sqlite naming conventions of alphanumerical and limited symbols.

Parameters:

name (str) – bookmark name

Returns:

true if bookmark name is valid

Return type:

bool
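A name check of this kind is commonly written as a regular expression. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the pattern (letters, digits, and underscores only) is a guess, since this section does not state which symbols Ringtail accepts.

```python
import re

# Assumption: letters, digits, and underscores only. The exact set of
# symbols Ringtail accepts is not stated in this section.
_BOOKMARK_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def looks_like_valid_bookmark_name(name: str) -> bool:
    """Return True if `name` consists only of the characters assumed above."""
    return _BOOKMARK_PATTERN.fullmatch(name) is not None


print(looks_like_valid_bookmark_name("passing_results"))
print(looks_like_valid_bookmark_name("drop table; --"))
```

Restricting bookmark names matters because they end up as identifiers inside SQL statements, where they cannot be passed as bound parameters.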

class ringtail.ringtailoptions.ReadOptions

Bases: RTOptions

Object that holds choices and default values for read and export modes, mostly used for the command line interface.

checks()

Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'data_from_bookmark': {'default': None, 'description': 'Write log of --outfields data for bookmark specified by --bookmark_name. Must use without any filters.', 'type': <class 'bool'>}, 'export_bookmark_csv': {'default': None, 'description': 'Create csv of the bookmark given with bookmark_name. Output as <bookmark_name>.csv. Can also export full database tables.', 'type': <class 'str'>}, 'export_bookmark_db': {'default': None, 'description': 'Export a database containing only the results found in the bookmark specified by --bookmark_name. Will save as <input_db>_<bookmark_name>.db', 'type': <class 'bool'>}, 'export_query_csv': {'default': None, 'description': 'Create csv of the requested SQL query. Output as query.csv. MUST BE PRE-FORMATTED IN SQL SYNTAX e.g. SELECT [columns] FROM [table] WHERE [conditions]', 'type': <class 'str'>}, 'export_receptor': {'default': None, 'description': 'Export stored receptor pdbqt. Will write to current directory.', 'type': <class 'bool'>}, 'find_similar_ligands': {'default': None, 'description': 'Allows user to find similar ligands to given ligand name based on previously performed morgan fingerprint or interaction clustering.', 'type': <class 'str'>}, 'plot': {'default': None, 'description': 'Makes scatterplot of LE vs Best Energy, saves as scatter.png.', 'type': <class 'bool'>}, 'pymol': {'default': None, 'description': 'Lauch PyMOL session and plot of ligand efficiency vs docking score for molecules in bookmark specified with --bookmark_name. Will display molecule in PyMOL when clicked on plot. Will also open receptor if given.', 'type': <class 'bool'>}}
class ringtail.ringtailoptions.ResultsProcessingOptions

Bases: RTOptions

Class that holds database write options that affect write time, such as how to break up data files, number of computer processes to use, and how many poses to store.

checks()

Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'add_interactions': {'default': False, 'description': "Find interactions between ligand poses and receptor and save to database. Requires receptor PDBQT to be given with input files (all modes) and 'receptor_file' to be specified with Vina mode. SIGNIFICANTLY INCREASES DATBASE WRITE TIME.", 'type': <class 'bool'>}, 'interaction_cutoffs': {'default': [3.7, 4.0], 'description': "Use with 'add_interactions', specify distance cutoffs for measuring interactions between ligand and receptor in angstroms. Give as string, separating cutoffs for hydrogen bonds and VDW with comma (in that order). E.g. '-ic 3.7,4.0' will set the cutoff for hydrogen bonds to 3.7 angstroms and for VDW to 4.0. These are the default cutoffs.", 'type': <class 'list'>}, 'interaction_tolerance': {'default': None, 'description': 'Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose. Can use as flag with default tolerance of 0.8 for cmd line tool, or give other value as desired (cmd line and api). Only compatible with ADGPU mode.', 'type': <class 'float'>}, 'max_poses': {'default': 3, 'description': 'Store top pose for top n clusters.', 'type': <class 'int'>}, 'max_proc': {'default': None, 'description': 'Maximum number of processes to create during parallel file parsing. Defaults to number of CPU processors.', 'type': <class 'int'>}, 'store_all_poses': {'default': False, 'description': "Store all poses from input files. Overrides 'max_poses'.", 'type': <class 'bool'>}}
class ringtail.ringtailoptions.StorageOptions

Bases: RTOptions

Class that handles options for the storage (database) manager class, including conflict handling, and results clustering and ordering.

checks()

Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

options = {'bookmark_name': {'default': 'passing_results', 'description': "name for resulting book mark file. Default value is 'passing_results'", 'type': <class 'str'>}, 'duplicate_handling': {'default': None, 'description': "Specify how duplicate Results rows should be handled when inserting into database. Options are 'ignore' or 'replace'. Default behavior (no option provided) will allow duplicate entries.", 'type': <class 'str'>}, 'filter_bookmark': {'default': None, 'description': 'Perform filtering over specified bookmark.', 'type': <class 'str'>}, 'interaction_cluster': {'default': None, 'description': 'Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Useful for enhancing selection of ligands with diverse interactions.', 'type': <class 'float'>}, 'mfpt_cluster': {'default': None, 'description': 'Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Useful for selecting chemically dissimilar ligands.', 'type': <class 'float'>}, 'order_results': {'default': None, 'description': "Stipulates how to order the results when written to the log file. By default will be ordered by order results were added to the database. 
ONLY TAKES ONE OPTION.\n                            available fields are:  \n                            'e' (docking_score), \n                            'le' (ligand efficiency), \n                            'delta' (delta energy from best pose), \n                            'ref_rmsd' (RMSD to reference pose), \n                            'e_inter' (intermolecular energy), \n                            'e_vdw' (van der waals energy), \n                            'e_elec' (electrostatic energy), \n                            'e_intra' (intermolecular energy), \n                            'n_interact' (number of interactions), \n                            'rank' (rank of ligand pose), \n                            'run' (run number for ligand pose), \n                            'hb' (hydrogen bonds); ", 'type': <class 'str'>}, 'outfields': {'default': 'Ligand_name,e', 'description': "Defines which fields are used when reporting the results (to stdout and to the log file). Fields are specified as comma-separated values, e.g. 'outfields=e,le,hb'; by default, docking_score (energy) and ligand name are reported. 
Ligand always reported in first column available fields are: \n\n                            'Ligand_name' (Ligand name), \n                            'e' (docking_score), \n                            'le' (ligand efficiency), \n                            'delta' (delta energy from best pose), \n                            'ref_rmsd' (RMSD to reference pose), \n                            'e_inter' (intermolecular energy), \n                            'e_vdw' (van der waals energy), \n                            'e_elec' (electrostatic energy), \n                            'e_intra' (intermolecular energy), \n                            'n_interact' (number of iteractions), \n                            'ligand_smile' , \n                            'rank' (rank of ligand pose), \n                            'run' (run number for ligand pose), \n                            'hb' (hydrogen bonds), \n                            'receptor' (receptor name);", 'type': <class 'str'>}, 'output_all_poses': {'default': None, 'description': 'By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.', 'type': <class 'bool'>}, 'overwrite': {'default': None, 'description': "This option will allow overwriting of the database (in 'write'/add files-mode) and filtering log_file (in 'read'/filtering mode).", 'type': <class 'bool'>}}
order_options = {'delta', 'e', 'e_elec', 'e_inter', 'e_intra', 'e_vdw', 'hb', 'le', 'n_interact', 'rank', 'ref_rmsd', 'run'}
class ringtail.ringtailoptions.TypeSafe(default, type, object_name)

Bases: object

Class that handles safe typesetting of values of a specified built-in type. Any attribute can be set as a TypeSafe object, this ensures its type is checked whenever it is changed. This makes the attribute of type ‘object’ as opposed to its actual type. To return the value of an attribute as a native type value, you can create a ‘__getattribute__’ method in the class that holds the attribute (see e.g., RTOptions).

The hope is to extend this to work with custom types, such as “percentage” (float with a max and min value), and directory (string that must end with ‘/’).

Parameters:
  • object_name (str) – name of type safe instance

  • type (type) – any of the native types in python that the instance must adhere to

  • default (any) – default value of the object, can be any including None

  • value (any) – value of type type assigned to instance, can be same or different than default

Raises:

OptionError – if wrong type is attempted.
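The idea of a value whose type is checked on every assignment can be sketched as follows. `TypeChecked` is a hypothetical, simplified stand-in and not the TypeSafe implementation; it raises the built-in TypeError instead of OptionError so that the example stays self-contained, and it assumes None is always an acceptable value because many defaults in this reference are None.

```python
class TypeChecked:
    """Hypothetical, simplified stand-in for a type-checked option value."""

    def __init__(self, default, type, object_name):
        self.object_name = object_name
        self.type = type
        self.value = default  # goes through the check in __setattr__

    def __setattr__(self, name, val):
        # Only 'value' is checked; None is allowed (assumption, see lead-in).
        if name == "value" and val is not None and not isinstance(val, self.type):
            raise TypeError(
                f"{self.object_name} must be of type {self.type.__name__}, "
                f"got {type(val).__name__}"
            )
        super().__setattr__(name, val)


max_poses = TypeChecked(default=3, type=int, object_name="max_poses")
max_poses.value = 5
try:
    max_poses.value = "five"
except TypeError as err:
    error_message = str(err)
    print(error_message)
```

Note that isinstance treats bool as a subclass of int, so a stricter check would be needed if True should be rejected for an int option.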

ringtail.storagemanager module

class ringtail.storagemanager.StorageManager

Bases: object

check_passing_bookmark_exists(bookmark_name: str | None = None)

Checks if bookmark name is in database

Parameters:

bookmark_name (str, optional) – name of bookmark to check for existence; if not given, the storageman bookmark_name attribute is used

Returns:

indicates if bookmark_name exists in the current database

Return type:

bool

check_storage_compatibility()

Checks if chosen storage type has been implemented

Parameters:

storage_type (str) – name of the storage type

Raises:

NotImplementedError – raised if selected storage type has not been implemented

Returns:

of implemented storage type

Return type:

class

close_storage(attached_db=None, vacuum=False)

Close connection to database

Parameters:
  • attached_db (str, optional) – name of attached DB (not including file extension)

  • vacuum (bool, optional) – indicates that database should be vacuumed before closing

crossref_filter(new_db: str, bookmark1_name: str, bookmark2_name: str, selection_type='-', old_db=None) tuple

Selects ligands found or not found in the given bookmark in both current db and new_db. Stores as temp view

Parameters:
  • new_db (str) – file name for database to attach

  • bookmark1_name (str) – string for name of first bookmark/temp table to compare

  • bookmark2_name (str) – string for name of second bookmark to compare

  • selection_type (str) – “+” or “-” indicating if ligand names should (“+”) or should not (“-”) be in both databases

  • old_db (str, optional) – file name for previous database

Returns:

(name of new bookmark (str), number of ligands passing new bookmark (int))

Return type:

tuple
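Cross-referencing two databases in SQLite is typically done by attaching the second database to the open connection. The sketch below illustrates the “+” and “-” selection types with two in-memory databases. The table names and the `crossref` helper are invented for illustration (plain tables stand in for bookmarks), and reading “-” as "in the first bookmark but not in the second" is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second, separate in-memory database under the schema name new_db.
conn.execute("ATTACH DATABASE ':memory:' AS new_db")
conn.execute("CREATE TABLE main.bookmark1 (LigName TEXT)")
conn.execute("CREATE TABLE new_db.bookmark2 (LigName TEXT)")
conn.executemany(
    "INSERT INTO main.bookmark1 VALUES (?)", [("lig1",), ("lig2",), ("lig3",)]
)
conn.executemany(
    "INSERT INTO new_db.bookmark2 VALUES (?)", [("lig2",), ("lig3",), ("lig4",)]
)


def crossref(selection_type: str) -> list:
    """Return ligand names from bookmark1 that are ('+') or are not ('-') in bookmark2."""
    operator = "IN" if selection_type == "+" else "NOT IN"
    rows = conn.execute(
        f"SELECT LigName FROM main.bookmark1 "
        f"WHERE LigName {operator} (SELECT LigName FROM new_db.bookmark2) "
        f"ORDER BY LigName"
    )
    return [row[0] for row in rows]


in_both = crossref("+")
only_in_first = crossref("-")
print(in_both, only_in_first)
```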

field_to_column_name = {'Ligand_name': 'LigName', 'delta': 'deltas', 'e': 'docking_score', 'e_elec': 'energies_electro', 'e_inter': 'energies_inter', 'e_intra': 'energies_intra', 'e_vdw': 'energies_vdw', 'hb': 'num_hb', 'interactions': 'interactions', 'le': 'leff', 'ligand_smile': 'ligand_smile', 'n_interact': 'nr_interactions', 'rank': 'pose_rank', 'receptor': 'receptor', 'ref_rmsd': 'reference_rmsd', 'run': 'run_number'}
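The mapping above translates the short field names used in outfields and order_results into database column names. The sketch below applies a subset of that mapping to an outfields string and adds the ligand name in first position when it is missing, as described for outfields. `outfields_to_columns` is a hypothetical helper, not a StorageManager method.

```python
# Subset of the field_to_column_name mapping listed above.
field_to_column_name = {
    "Ligand_name": "LigName",
    "e": "docking_score",
    "le": "leff",
    "hb": "num_hb",
    "rank": "pose_rank",
}


def outfields_to_columns(outfields: str) -> list:
    """Translate a comma-separated outfields string into column names."""
    fields = [f.strip() for f in outfields.split(",") if f.strip()]
    # Ligand name is always reported, in first position if not specified.
    if "Ligand_name" not in fields:
        fields.insert(0, "Ligand_name")
    unknown = [f for f in fields if f not in field_to_column_name]
    if unknown:
        raise ValueError(f"unknown outfields: {unknown}")
    return [field_to_column_name[f] for f in fields]


columns = outfields_to_columns("e,le,hb")
print(columns)
```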
filter_results(all_filters: dict, suppress_output=False) iter

Generate and execute database queries from given filters.

Parameters:
  • all_filters (dict) – dict containing all filters. Expects format and keys corresponding to ringtail.Filters().todict()

  • suppress_output (bool) – if True, the filtering summary is not printed to stdout

Returns:

iterable, such as an sqlite cursor, of passing results

Return type:

iter
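
As a rough illustration of turning a filter dict into a database query, the sketch below handles only the 'ebest' and 'eworst' keys. It is not the query builder Ringtail uses: the helper names, the two-column Results table, and the assumption that lower docking scores are better (so 'ebest' acts as a lower bound and 'eworst' as an upper bound) are illustrative choices.

```python
import sqlite3


def build_score_query(filters):
    """Translate 'ebest'/'eworst' entries into a parameterized SQL query.

    Hypothetical helper: lower docking scores are treated as better, so
    'ebest' is a lower bound and 'eworst' an upper bound.
    """
    clauses, params = [], []
    if filters.get("ebest") is not None:
        clauses.append("docking_score >= ?")
        params.append(filters["ebest"])
    if filters.get("eworst") is not None:
        clauses.append("docking_score <= ?")
        params.append(filters["eworst"])
    query = "SELECT LigName, docking_score FROM Results"
    if clauses:
        query += " WHERE " + " AND ".join(clauses)
    return query + " ORDER BY docking_score", params


def demo_filter(filters):
    """Run the generated query against a small in-memory Results table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL)")
    conn.executemany(
        "INSERT INTO Results VALUES (?, ?)",
        [("lig1", -9.5), ("lig2", -7.0), ("lig3", -4.2)],
    )
    query, params = build_score_query(filters)
    return conn.execute(query, params).fetchall()
```

Values are passed as bound parameters rather than formatted into the SQL text, which keeps the generated query safe for arbitrary numeric input.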

finalize_database_write()

Finalizes the database after it has been written to, and saves the current database schema to the SQLite database.

get_plot_data(bookmark_name: str = None, only_passing=False)

This function is expected to return an ascii plot representation of the results

Parameters:
  • bookmark_name (str) – name of bookmark for which to fetch passing data. Will use default bookmark name if None. Returns empty list if bookmark does not exist.

  • only_passing (bool) – Only return data for passing ligands. Will return empty list for all data.

Returns:

cursors as (<all data cursor>, <passing data cursor>)

Return type:

tuple

insert_data(results_array, ligands_array, interaction_list, receptor_array=[], insert_receptor=False)

Inserts data from all arrays returned from results manager.

Parameters:
  • results_array (list) – list of data to be stored in Results table

  • ligands_array (list) – list of data to be stored in Ligands table

  • interaction_list (list) – list of data to be stored in interaction tables

  • receptor_array (list) – list of data to be stored in Receptors table

  • insert_receptor (bool, optional) – flag indicating that receptor info should be inserted

insert_interactions(Pose_IDs: list, interactions_list, duplicates)

Takes list of interactions, inserts into database

Parameters:
  • Pose_IDs (list(int)) – list of pose ids assigned while writing the current results to database

  • interactions_list (list) – List of tuples for interactions in form (“type”, “chain”, “residue”, “resid”, “recname”, “recid”)

  • duplicates (list(Pose_ID)) – any duplicates identified in “insert_results”, if duplicate handling has been specified

prune()

Deletes rows from results, ligands, and interactions in a bookmark if they do not pass filtering criteria

class ringtail.storagemanager.StorageManagerSQLite(db_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, filter_bookmark: str = None, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, duplicate_handling: str = None)

Bases: StorageManager

SQLite-specific StorageManager subclass

conn

Connection to database

Type:

SQLite.conn

open_cursors

list of cursors that were not closed by the function that created them. Will be closed by close_connection method.

Type:

list

db_file

database name

Type:

str

overwrite

switch to overwrite database if it exists

Type:

bool

order_results

what column name will be used to order results once read

Type:

str

outfields

data fields/columns to include when reading and outputting data

Type:

str

filter_bookmark

name of bookmark that filtering will be performed over

Type:

str

output_all_poses

whether or not to output all poses of a ligand

Type:

bool

mfpt_cluster

distance in ångströms to cluster ligands based on morgan fingerprints

Type:

float

interaction_cluster

distance in ångströms to cluster ligands based on interactions

Type:

float

bookmark_name

name of current bookmark being written to or read from

Type:

str

duplicate_handling

optional attribute to deal with insertion of ligands already in the database

Type:

str

current_bookmark_name

name of last view to have been written to in the database

Type:

str

filtering_window

name of bookmark/view being filtered on

Type:

str

index_columns

Type:

list

view_suffix

current suffix for views

Type:

int

temptable_suffix

current suffix for temporary tables

Type:

int

field_to_column_name

Dictionary for converting ringtail options into DB column names

Type:

dict
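
A short sketch of how such a mapping might be applied to an outfields value. The dictionary below is a subset of the field_to_column_name mapping listed for StorageManager; the helper outfields_to_columns and the comma-separated outfields format are assumptions for illustration.

```python
# Subset of the field_to_column_name mapping listed for StorageManager.
FIELD_TO_COLUMN_NAME = {
    "Ligand_name": "LigName",
    "e": "docking_score",
    "le": "leff",
    "hb": "num_hb",
    "rank": "pose_rank",
    "run": "run_number",
}


def outfields_to_columns(outfields):
    """Map a comma-separated outfields string to database column names."""
    columns = []
    for field in outfields.split(","):
        field = field.strip()
        if field not in FIELD_TO_COLUMN_NAME:
            raise KeyError(f"unknown output field: {field!r}")
        columns.append(FIELD_TO_COLUMN_NAME[field])
    return columns
```

For example, outfields_to_columns("Ligand_name, e, le") yields the column names LigName, docking_score, and leff, in that order.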

bookmark_has_rows(bookmark_name: str) bool

Method that checks if a given bookmark has any data in it

Parameters:

bookmark_name (str) – view to check

Returns:

True if more than zero rows in bookmark

Return type:

bool
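
Because bookmarks are described as SQLite views, a row check of this kind can be sketched with a single EXISTS query. The helper names and the demo schema below are hypothetical and only illustrate the idea.

```python
import sqlite3


def view_has_rows(conn, view_name):
    """Return True if the named view yields at least one row.

    The view name is quoted as an identifier because SQLite cannot bind
    identifiers as query parameters.
    """
    quoted = '"' + view_name.replace('"', '""') + '"'
    row = conn.execute(f"SELECT EXISTS (SELECT 1 FROM {quoted})").fetchone()
    return bool(row[0])


def make_bookmark_demo():
    """Build an in-memory database with one populated and one empty view."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL)")
    conn.execute("INSERT INTO Results VALUES ('lig1', -8.1)")
    conn.execute(
        "CREATE VIEW passing AS SELECT * FROM Results WHERE docking_score < -6"
    )
    conn.execute(
        "CREATE VIEW no_hits AS SELECT * FROM Results WHERE docking_score < -20"
    )
    return conn
```

EXISTS stops at the first matching row, so the check does not need to count every row in a large bookmark.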

check_ringtaildb_version()

Checks the database version and confirms whether the code base is compatible with it

Returns:

whether or not db is compatible with the code base (bool), and the current database version (str)

Return type:

bool

check_storage_ready(run_mode: str, docking_mode: str, store_all_poses: bool, max_poses: int)

Check that storage is ready before proceeding, and creates new tables if needed

Parameters:
  • run_mode (str) – whether ringtail is run using the command line interface or the API

  • docking_mode (str) – what docking engine was used to produce results

  • store_all_poses (bool) – overrides max_poses

  • max_poses (int) – max poses to save to db

Raises:

clone(backup_name=None)

Creates a copy of the db

Parameters:

backup_name (str, optional) – name of the cloned database
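
One way to copy an SQLite database from Python is the standard-library backup API. The sketch below shows that approach under the assumption that a full copy of all tables and views is wanted; it is not necessarily how clone is implemented, and clone_database is a hypothetical name.

```python
import sqlite3


def clone_database(source_conn, backup_name):
    """Copy every table and view of source_conn into a new database file."""
    target = sqlite3.connect(backup_name)
    try:
        source_conn.backup(target)  # standard-library online backup API
    finally:
        target.close()
```

The backup runs against an open connection, so the source database does not need to be closed first.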

count_receptors_in_db()

returns number of rows in Receptors table where receptor_object already has blob

Returns:

number of rows in receptors table (int), and name of receptor if present in table (str)

Return type:

int

Raises:

DatabaseQueryError

create_bookmark(name, query, temp=False, add_poseID=False, filters={})

Takes a name and a selection query and creates a bookmark with that name. Bookmarks are Ringtail-specific views whose information is stored in the ‘Bookmark’ table. #FIXME bug where ligand filter only results are not added as bookmarks

Parameters:
  • name (str) – Name for bookmark which will be created

  • query (str) – SQLite-formatted query used to create bookmark

  • temp (bool, optional) – Flag if bookmark should be temporary

  • add_poseID (bool, optional) – Add Pose_ID column to bookmark

  • filters (dict, optional) – a dict of filters used to construct the query

create_bookmark_from_temp_table(temp_table_name, bookmark_name, original_bookmark_name, wanted_list, unwanted_list=[])

Resaves temp bookmark stored in self.current_bookmark_name as a new permanent bookmark

Parameters:
  • bookmark_name (str) – name of bookmark to save last temp bookmark as

  • original_bookmark_name (str) – name of original bookmark

  • wanted_list (list) – List of wanted database names

  • unwanted_list (list, optional) – List of unwanted database names

  • temp_table_name (str) – name of temporary table

create_temp_table_from_bookmark()

Method that creates a temporary table named “passing_temp”. Please note that this table will be dropped as soon as the database connection closes.

drop_bookmark(bookmark_name: str)

Drops specified bookmark from database

Parameters:

bookmark_name (str) – bookmark to be dropped

Raises:

DatabaseInsertionError

fetch_bookmark(bookmark_name: str) Cursor

returns SQLite cursor of all fields in bookmark

Parameters:

bookmark_name (str) – name of bookmark to retrieve

Returns:

cursor of requested view

Return type:

sqlite3.Cursor

fetch_clustered_similars(ligname: str)

Given ligname, returns poseids for similar poses/ligands from previous clustering. User prompted at runtime to choose cluster.

Parameters:

ligname (str) – ligname for ligand to find similarity with

Raises:

fetch_data_for_passing_results() iter

Will return SQLite cursor with requested data for outfields for poses that passed filter in self.bookmark_name

Returns:

sqlite cursor of data from passing data

Return type:

iter

Raises:

OptionError

fetch_filters_from_bookmark(bookmark_name: str | None = None)

Method that will retrieve filter values used to construct bookmark

Parameters:

bookmark_name (str, optional) – can get filter values for given bookmark, or filter values from currently active bookmark in storageman

Returns:

containing the filter data

Return type:

dict

fetch_flexres_info()

fetch flexible residues names and atomname lists

Returns:

(flexible_residues, flexres_atomnames)

Return type:

tuple

fetch_interaction_info_by_index(interaction_idx) tuple

Returns tuple containing interaction info for given interaction_idx

Parameters:

interaction_idx (int) – interaction index to fetch info for

Returns:

tuple of info for requested interaction

Return type:

tuple

fetch_nonpassing_pose_properties(ligname)

fetch coordinates for poses of ligname which did not pass the filter

Parameters:

ligname (str) – name of ligand to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_passing_ligand_output_info() iter

fetch information required by vsmanager for writing out molecules

Returns:

iterable containing LigName, ligand_smile, atom_index_map, hydrogen_parents

Return type:

iter

fetch_passing_pose_properties(ligname)

fetch coordinates for poses passing filter for given ligand

Parameters:

ligname (str) – name of ligand to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_pose_interactions(Pose_ID) iter

Fetch all interactions parameters belonging to a Pose_ID

Parameters:

Pose_ID (int) – pose id, 1-1 with Results table

Returns:

iterable of interaction information for given Pose_ID

Return type:

iter

fetch_receptor_object_by_name(rec_name)

Returns Receptor object from database for given rec_name

Parameters:

rec_name (str) – Name of receptor to return object for

Returns:

receptor object as a string

Return type:

str

fetch_receptor_objects()

Returns all Receptor objects from database

Parameters:

rec_name (str) – Name of receptor to return object for

Returns:

iterable of receptor names and objects

Return type:

iter (tuple)

fetch_single_ligand_output_info(ligname) str

get output information for given ligand

Parameters:

ligname (str) – ligand name

Raises:

DatabaseQueryError

Returns:

information containing smiles, atom and index mapping, and hydrogen parents

Return type:

str

fetch_single_pose_properties(pose_ID: int) iter

fetch coordinates for pose given by pose_ID

Parameters:

pose_ID (int) – ID of the pose to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_summary_data(columns=['docking_score', 'leff'], percentiles=[1, 10]) dict

Collect summary data for database:
  • Num Ligands

  • Num stored poses

  • Num unique interactions

  • min, max, percentiles for columns in columns

Parameters:
  • columns (list (str)) – columns to be displayed and used in summary

  • percentiles (list(int)) – percentiles to consider

Returns:

dict of data summary

Return type:

dict
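
The per-column statistics named above (min, max, percentiles) can be sketched in pure Python. This is an illustration only: the nearest-rank percentile definition and the helper name summarize_column are assumptions, and the real method may compute these values differently, for instance inside the database.

```python
import math


def summarize_column(values, percentiles=(1, 10)):
    """Return min, max, and nearest-rank percentiles for a list of numbers."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    summary = {"min": ordered[0], "max": ordered[-1]}
    for pct in percentiles:
        # Nearest-rank method: the smallest value with at least pct percent
        # of the data at or below it.
        rank = max(1, math.ceil(pct * len(ordered) / 100))
        summary[f"{pct}%"] = ordered[rank - 1]
    return summary
```

Since values are sorted in ascending order, a low percentile of docking scores corresponds to the most negative scores.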

classmethod format_for_storage(ligand_dict: dict) tuple

Takes the file dictionary from the file parser and formats it into the required storage format

Parameters:

ligand_dict (dict) – Dictionary containing data from the fileparser

Returns:

tuple of lists ([result_row_1, result_row_2, …], ligand_row, [interaction_tuple_1, interaction_tuple_2, …])

Return type:

tuple

get_all_bookmark_names()

Get all bookmarks in sql database as a list of names. Bookmarks are a specific type of sqlite-views whose information is stored in the Bookmarks table.

Returns:

list of bookmark names

Return type:

list

get_current_bookmark_name()

returns current bookmark name

Returns:

name of last passing results bookmark used by database

Return type:

str

get_maxmiss_union(total_combinations: int)

Get results that are in union considering max miss

Parameters:

total_combinations (int) – number of possible combinations

Returns:

iterable of passing results

Return type:

iter
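
The max_miss filter is described as computing all combinations of interaction filters that exclude up to max_miss interactions from the given set. The enumeration step can be sketched with itertools; the helper name interaction_combinations is hypothetical, and the database side of the union is not shown.

```python
from itertools import combinations


def interaction_combinations(interactions, max_miss=0):
    """List every subset of interactions that omits at most max_miss of them."""
    if not 0 <= max_miss <= len(interactions):
        raise ValueError("max_miss must be between 0 and the number of interactions")
    combos = []
    for n_missing in range(max_miss + 1):
        keep = len(interactions) - n_missing
        combos.extend(combinations(interactions, keep))
    return combos
```

With three interaction filters and max_miss=1, this gives four combinations: the full set plus the three ways of leaving one filter out.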

insert_receptor_blob(receptor, rec_name)

Takes object of Receptor class, updates the column in Receptor table

Parameters:
  • receptor (bytes) – bytes receptor object to be inserted into DB

  • rec_name (string) – Name of receptor. Used to insert into correct row of DB

Raises:

DatabaseInsertionError

overwrite_storage()

Will drop all tables in the database.

set_bookmark_suffix(suffix)

Sets internal bookmark_suffix variable

Parameters:

suffix (str) – suffix to attach to bookmark-related queries or creation

to_dataframe(requested_data: str, table=True) pandas.DataFrame

Returns a pandas DataFrame of the table or query given as requested_data

Parameters:
  • requested_data (str) – String containing SQL-formatted query or table name

  • table (bool) – Flag indicating if requested_data is table name or not

Returns:

dataframe of requested data

Return type:

pd.DataFrame

update_database_version(new_version, consent=False)

method that updates sqlite database schema 1.0.0 or 1.1.0 to 1.1.0 or 2.0.0

#NOTE: If you created a version 1 database with the duplicate handling option, there is a chance of inconsistent behavior of anything involving interactions as the Pose_ID was not used as an explicit foreign key in db v1.0.0 and v1.1.0.

Parameters:

consent (bool, optional) – variable to ensure consent to update database is explicit

Returns:

bool

Module contents

class ringtail.CLOptionParser

Bases: object

Command line option/argument parser. Options and switches are utilized in the script ‘rt_process_vs.py’.

process_mode

operating in ‘write’ or ‘read’ mode

Type:

str

rtcore

ringtail core object initialized with the provided db_file

Type:

RingtailCore

filters

fully parsed and organized optional filters

Type:

dict

file_sources

fully parsed docking results and receptor files

Type:

dict

writeopts

fully parsed arguments related to database writing

Type:

dict

storageopts

fully parsed arguments related to how the storage system behaves

Type:

dict

outputopts

fully parsed arguments related to output and reading from the database

Type:

dict

print_summary

switch to print database summary

Type:

bool

filtering

switch to run filtering method

Type:

bool

plot

switch to plot the data

Type:

bool

export_bookmark_db

switch to export bookmark as a new database

Type:

bool

export_receptor

switch to export receptor information to pdbqt

Type:

bool

pymol

switch to visualize ligands in pymol

Type:

bool

data_from_bookmark

switch to write bookmark data to the output log file

Type:

bool

Raises:

OptionError – Error when an option cannot be parsed correctly

process_options(parsed_opts)

Process and organize command line options into ringtail options and filter dictionaries and ringtail core attributes

Parameters:

parsed_opts (argparse.Namespace) – arguments provided through the cmdline_parser method.

exception ringtail.DatabaseConnectionError

Bases: StorageError

exception ringtail.DatabaseInsertionError

Bases: StorageError

exception ringtail.DatabaseQueryError

Bases: StorageError

exception ringtail.DatabaseTableCreationError

Bases: StorageError

exception ringtail.DatabaseViewCreationError

Bases: StorageError

class ringtail.DockingFileReader(*args: Any, **kwargs: Any)

Bases: Process

This class is the individual worker for processing docking results. One instance of this class is instantiated for each available processor.

queueIn

current queue for the processor/file reader

Type:

multiprocess.Queue

queueOut

queue for the processor/file reader after adding or removing an item

Type:

multiprocess.Queue

pipe_conn

pipe connection to the reader

Type:

multiprocess.Pipe

storageman

storageman object

Type:

StorageManager

storageman_class

storagemanager child class/database type

Type:

StorageManager

docking_mode

describes what docking engine was used to produce the results

Type:

str

max_poses

max number of poses to store for each ligand

Type:

int

interaction_tolerance

Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.

Type:

float

store_all_poses

Store all poses from docking results

Type:

bool

add_interactions

find and save interactions between ligand poses and receptor

Type:

bool

interaction_cutoffs

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:

list(float)

target

receptor name

Type:

str

run()

Method override from the parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method, which is called by the initialization (see below) with start()

Raises:
  • NotImplementedError – if parser for specific docking result type is not implemented

  • FileParsingError

exception ringtail.FileParsingError

Bases: Exception

class ringtail.Filters

Bases: RTOptions

Object that holds all optional filters.

checks()

Ensures all values are internally consistent and valid. Runs once after all values are set initially, then every time a value is changed.

classmethod get_filter_keys(group) list

Provide keys associated with each of the filter groups.

Parameters:

group (str) – includes property filters, interaction filters, ligand filters, or all filters

Returns:

list of filter keywords associated with the specified group(s)

options = {'ebest': {'default': None, 'description': 'Specify the best energy value accepted.', 'type': <class 'float'>}, 'eworst': {'default': None, 'description': 'Specify the worst energy value accepted.', 'type': <class 'float'>}, 'hb_count': {'default': None, 'description': "Accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [('hb_count', 5)].", 'type': <class 'list'>}, 'hb_interactions': {'default': [], 'description': "Define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'le_percentile': {'default': None, 'description': 'Specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'lebest': {'default': None, 'description': 'Specify the best ligand efficiency value accepted.', 'type': <class 'float'>}, 'leworst': {'default': None, 'description': 'Specify the worst ligand efficiency value accepted.', 'type': <class 'float'>}, 'ligand_max_atoms': {'default': None, 'description': 'Maximum number of heavy atoms a ligand may have.', 'type': <class 'int'>}, 'ligand_name': {'default': None, 'description': "Specify list of ligand name(s). Will combine name filters with 'OR'", 'type': <class 'list'>}, 'ligand_operator': {'default': None, 'description': "Logical join operator for multiple substruct filters. Will apply within 'ligand_substruct' filters and within 'ligand_substruct_pos' filters (the two groups are always joined by 'AND').", 'type': <class 'str'>}, 'ligand_substruct': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator. 
If error delimit each substructure with ''.", 'type': <class 'list'>}, 'ligand_substruct_pos': {'default': None, 'description': "SMARTS pattern(s) for substructure matching. For API use list with six elements ['[Oh]C', 0, 1.2, -5.5, 10.0, 15.5] -> ['smart_string', index_of_positioned_atom, cutoff_distance, x, y, z]. For the CLI use as a string without comma separators, separating each filter with commas -> '[Oh]C 0 1.2 -5.5 10.0 15.5'. Will be evaluated as 'this' OR 'that' unless specified by using the ligand_operator", 'type': <class 'list'>}, 'max_miss': {'default': 0, 'description': "Will compute all possible combinations of interaction filters excluding up to 'max_miss' number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with 'enumerate_interaction_combs' for enumeration of poses passing each individual combination of interaction filters.", 'type': <class 'int'>}, 'react_any': {'default': None, 'description': 'Check if ligand reacted with any residue.', 'type': <class 'bool'>}, 'reactive_interactions': {'default': [], 'description': "Check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}, 'score_percentile': {'default': None, 'description': 'Specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.', 'type': <class 'float'>}, 'vdw_interactions': {'default': [], 'description': "Define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [('A:VAL:279:', True), ('A:LYS:162:', True)] -> [('chain:resname:resid:atomname', <wanted (bool)>), ('chain:resname:resid:atomname', <wanted (bool)>)].", 'type': <class 'list'>}}
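
The interaction filters above use the pattern [CHAIN]:[RES]:[NUM]:[ATOM_NAME] paired with a wanted flag, for example ('A:VAL:279:', True). The sketch below shows one way such a specification could be split into its parts. It assumes empty fields mean "any" and does not handle the optional leading “-”; parse_interaction is a hypothetical helper, not part of the Filters class.

```python
def parse_interaction(spec, wanted=True):
    """Split 'CHAIN:RES:NUM:ATOM_NAME' into its parts; empty parts become None."""
    parts = spec.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected four colon-separated fields, got {spec!r}")
    chain, resname, resid, atomname = (part or None for part in parts)
    return {
        "chain": chain,
        "resname": resname,
        "resid": int(resid) if resid is not None else None,
        "atomname": atomname,
        "wanted": wanted,
    }
```

A specification such as 'A:VAL:279:' leaves the atom name unset, so the resulting dict carries None for atomname.
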
class ringtail.InteractionFinder(rec_string, interaction_cutoff_radii)

Bases: object

Class for handling and calculating ligand-receptor interactions.

rec_string

string describing the receptor

Type:

str

interaction_cutoff_radii

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:

list(float)

find_pose_interactions(lig_atomtype_list: list, lig_coordinates: list) dict

Method that identifies interactions for a pose within the given cutoff distances in the main class.

Parameters:
  • lig_atomtype_list (list) – list of atoms in the ligand

  • lig_coordinates (list) – coordinates for the atoms in the ligand

Returns:

all interaction details for a given ligand pose

Return type:

dict
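
A distance-cutoff search is the core idea behind finding contacts for a pose. The following is a simplified sketch under stated assumptions: receptor atoms are given as (label, coordinates) pairs, a single cutoff is used, and atom types are ignored. The helper atoms_within_cutoff is hypothetical and does not reproduce the hydrogen bond or VDW logic of InteractionFinder.

```python
import math


def atoms_within_cutoff(lig_coordinates, rec_atoms, cutoff):
    """Return (ligand_index, receptor_label, distance) for pairs within cutoff.

    rec_atoms is a hypothetical list of (label, (x, y, z)) tuples.
    """
    contacts = []
    for lig_idx, lig_xyz in enumerate(lig_coordinates):
        for label, rec_xyz in rec_atoms:
            distance = math.dist(lig_xyz, rec_xyz)
            if distance <= cutoff:
                contacts.append((lig_idx, label, round(distance, 3)))
    return contacts
```

This brute-force loop is quadratic in the number of atoms; a spatial index would be the usual choice for large receptors.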

class ringtail.MPManager(docking_mode, max_poses, interaction_tolerance, store_all_poses, add_interactions, interaction_cutoffs, max_proc, storageman, storageman_class, chunk_size, target, receptor_file, file_pattern=None, file_sources=None, string_sources=None)

Bases: object

Manager that orchestrates parallel processing of docking results data, using one of the supported multiprocessors.

docking_mode

describes what docking engine was used to produce the results

Type:

str

max_poses

max number of poses to store for each ligand

Type:

int

interaction_tolerance

Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.

Type:

float

store_all_poses

Store all poses from docking results

Type:

bool

add_interactions

find and save interactions between ligand poses and receptor

Type:

bool

interaction_cutoffs

cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

Type:

list(float)

max_proc

Maximum number of processes to create during parallel file parsing.

Type:

int

storageman

storageman object

Type:

StorageManager

storageman_class

storagemanager child class/database type

Type:

StorageManager

chunk_size

how many tasks to send to a processor at a time

Type:

int

target

name of receptor

Type:

str

receptor_file

file path to receptor

Type:

str

file_pattern

file pattern to look for if recursively finding results files to process

Type:

str, optional

file_sources

RingtailOption object that holds all attributes related to results files

Type:

InputFiles, optional

string_sources

RingtailOption object that holds all attributes related to results strings

Type:

InputStrings, optional

num_files

number of files processed at any given time

Type:

int

process_results()

Processes results data (files or string sources) by adding them to the queue and starting their processing in multiprocess.
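
The chunk_size attribute controls how many tasks are sent to a processor at a time. A generic way to split a list of result files into chunks of that size is sketched below; the generator chunked is a hypothetical helper and the queueing of chunks to worker processes is not shown.

```python
from itertools import islice


def chunked(items, chunk_size=1):
    """Yield successive lists of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk
```

Each yielded list could then be placed on a queue as a single unit of work.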

exception ringtail.MultiprocessingError

Bases: Exception

exception ringtail.OptionError

Bases: Exception

exception ringtail.OutputError

Bases: Exception

class ringtail.OutputManager(log_file=None, export_sdf_path=None)

Bases: object

Class for creating outputs, can be a context manager to handle log files

log_file

name for log file

Type:

str

export_sdf_path

path for exporting SDF molecule files

Type:

str

_log_open

if log file is open or not

Type:

bool
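
The context-manager pattern mentioned in the class description can be sketched with a small stand-in class. LogFile and write_line are hypothetical names used only to show how opening and closing a log file can be tied to a with block; they are not the OutputManager interface.

```python
class LogFile:
    """Hypothetical stand-in for a log-writing class usable in a with block."""

    def __init__(self, log_file):
        self.log_file = log_file
        self._log_open = False
        self._handle = None

    def __enter__(self):
        self._handle = open(self.log_file, "w")
        self._log_open = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self._handle.close()
        self._handle = None
        self._log_open = False
        return False  # do not swallow exceptions raised inside the block

    def write_line(self, line):
        if not self._log_open:
            raise RuntimeError("log file is not open")
        self._handle.write(line + "\n")
```

The file is closed when the with block exits, including when an exception is raised inside it.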

close_logfile()

Closes the log file properly and resets the file pointer to the filename

log_num_passing_ligands(number_passing_ligands: int)

Write the number of ligands which pass given filter to log file

Parameters:

number_passing_ligands (int) – number of ligands that passed filter

Raises:

OutputError

open_logfile(write_filters_header=True)

Opens log file and creates it if needed

Parameters:

write_filters_header (bool) – only used because one method does not take the same headers

Raises:

OutputError

plot_all_data(xdata, ydata, num_of_bins: int = 100)

Takes dictionary of binned data where key is the coordinates of the bin and value is the number of points in that bin. Adds to scatter plot colored by value

Parameters:
  • xdata (list) – list of x axis data (needs to be same length as ydata)

  • ydata (list) – list of y axis data (needs to be same length as xdata)

  • num_of_bins (int) – number of bins to organize data in

Returns:

matplotlib.pyplot.figure

Raises:

OutputError
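
The binned data described above, where each key is a bin coordinate and each value is the number of points in that bin, can be produced along the following lines. This is a sketch under assumptions: bins are equal-width over the data range and keys are integer (x_bin, y_bin) indices; bin_points is a hypothetical helper and the plotting itself is omitted.

```python
from collections import Counter


def bin_points(xdata, ydata, num_of_bins=100):
    """Count points per 2D bin; keys are (x_bin, y_bin) indices."""
    if len(xdata) != len(ydata):
        raise ValueError("xdata and ydata must have the same length")
    if not xdata:
        return {}

    def bin_index(value, low, high):
        if high == low:
            return 0
        # Clamp so the maximum value falls in the last bin.
        return min(int((value - low) / (high - low) * num_of_bins), num_of_bins - 1)

    x_low, x_high = min(xdata), max(xdata)
    y_low, y_high = min(ydata), max(ydata)
    counts = Counter(
        (bin_index(x, x_low, x_high), bin_index(y, y_low, y_high))
        for x, y in zip(xdata, ydata)
    )
    return dict(counts)
```

Binning before plotting keeps the number of drawn markers bounded by the number of occupied bins rather than the number of poses.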

plot_single_points(x: list, y: list, markersize: int = 20, color='crimson')

Add points to scatter plot with given x and y coordinates and color.

Parameters:
  • x (float) – x coordinate

  • y (float) – y coordinate

  • color (str, optional) – Color for point. Default is 'crimson'.

Raises:

OutputError

save_scatterplot()

Saves current figure as scatter.png

Raises:

OutputError

scatter_hist(x, y, z, ax_histx, ax_histy)

Makes scatterplot with a histogram on each axis

Parameters:
  • x (list) – x coordinates for data

  • y (list) – y coordinates for data

  • z (list) – z coordinates for data

  • ax (matplotlib.axis) – scatterplot axis

  • ax_histx (matplotlib.axis) – x histogram axis

  • ax_histy (matplotlib.axis) – y histogram axis

Raises:

OutputError

write_filter_log(lines)

Writes lines from results iterable into log file

Parameters:

lines (iterable) – Iterable with tuples of data for writing into log

Raises:

OutputError

Returns:

number of ligands passing that are written to log file

Return type:

int

write_filters_to_log(filters_dict, included_interactions, additional_info='')

Takes dictionary of filters, formats as string and writes to log file

Parameters:
  • filters_dict (dict) – dictionary with filtering options

  • included_interactions (list) – types of interactions to include in the filtering

  • additional_info (str) – any additional information to write to top of log file

Raises:

OutputError

write_find_similar_header(query_ligname, cluster_name)

Properly formats header for the log file find_similar_ligands

write_maxmiss_union_header()

Properly formats header for the log file if using max_miss and enumerate_interaction_combs

write_out_mol(filename, mol, flexres_mols, properties)

Writes out given mol as sdf. Will create the specified sdf folder in current working directory if needed.

Parameters:
  • filename (str) – name of SDF file that will be written to

  • mol (RDKit.Chem.Mol) – RDKit molobject to be written to SDF

  • flexres_mols (list) – dictionary of rdkit molecules for flexible residues

  • properties (dict) – dictionary of list of properties to add to mol before writing

Raises:

OutputError

write_receptor_pdbqt(recname: str, receptor_compbytes)

Writes a pdbqt file from receptor “blob”

Parameters:
  • recname (str) – name of receptor to use in output filename

  • receptor_compbytes (blob) – receptor blob

write_results_bookmark_to_log(bookmark_name)

Write the name of the result bookmark into log

Parameters:

bookmark_name (str) – name of current results’ bookmark in db

Raises:

OutputError

exception ringtail.RTCoreError

Bases: Exception

class ringtail.ReceptorManager

Bases: object

Class with methods dealing with formatting of receptor information

static blob2str(receptor_blob)

Converts a compressed receptor blob into the receptor string

Parameters:

receptor_blob (blob) – zipped receptor blob

Returns:

receptor string

Return type:

str

static make_receptor_blobs(file_list)

Creates compressed receptor info

Parameters:

file_list (str) – path to receptor file

Returns:

compressed receptor

Return type:

blob
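
The round trip between receptor text and a compressed blob can be sketched as follows. The use of zlib and UTF-8 here is an assumption chosen for illustration; the compression scheme actually used by ReceptorManager is not stated in this reference, and str_to_blob and blob_to_str are hypothetical names.

```python
import zlib


def str_to_blob(receptor_string):
    """Compress receptor text into bytes suitable for a BLOB column."""
    return zlib.compress(receptor_string.encode("utf-8"))


def blob_to_str(receptor_blob):
    """Recover the receptor text from its compressed bytes."""
    return zlib.decompress(receptor_blob).decode("utf-8")
```

Decompressing a blob produced by str_to_blob returns the original receptor text unchanged.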

class ringtail.ResultsManager(docking_mode: str = None, max_poses: int = None, interaction_tolerance: float = None, store_all_poses: bool = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, storageman: StorageManager = None, storageman_class: StorageManager = None, chunk_size: int = 1, parser_manager: str = 'multiprocess', file_sources=None, string_sources=None)

Bases: object

Class that handles the processing of the results, including passing on the docking results to the appropriate parallel/multi-processing unit

Parameters:
  • max_poses (int) – max number of poses to store for each ligand

  • interaction_tolerance (float) – Will add the interactions for poses within some tolerance RMSD range of the top pose in a cluster to that top pose.

  • store_all_poses (bool) – Store all poses from docking results

  • add_interactions (bool) – find and save interactions between ligand poses and receptor

  • interaction_cutoffs (list(float)) – cutoff for interactions of hydrogen bonds and VDW interactions, in ångströms

  • max_proc (int) – Maximum number of processes to create during parallel file parsing.

  • storageman (StorageManager) – storageman object

  • storageman_class (StorageManager) – storagemanager child class/database type

  • chunk_size (int) – how many tasks to send to a processor at a time

  • parser_manager (str, optional) – what parallelization or multiprocessing package to use

  • file_sources (InputFiles, optional) – given file sources including the receptor file

  • string_sources (InputStrings, optional) – given string sources including the path to the receptor

Raises:

ResultsProcessingError

process_docking_data()

Processes docking data in the form of files or strings

Raises:

ResultsProcessingError – if no file or string sources are provided, or if both are provided

exception ringtail.ResultsProcessingError

Bases: Exception

class ringtail.RingtailCore(db_file: str = 'output.db', storage_type: str = 'sqlite', docking_mode: str = 'dlg', logging_level: str = 'WARNING')

Bases: object

Core class for coordinating different actions on virtual screening, including adding results to storage, filtering and clustering, outputting data as rdkit molecules, plotting docking results, and visualizing select ligands in pymol.

db_file

name of database file being operated on

Type:

str

docking_mode

specifies what docking mode has been used for the results in the database

Type:

str

storageman

Interface module with database

Type:

StorageManager

resultsman

Module to deal with results processing before adding to database

Type:

ResultsManager

outputman

Manager for output tasks of log writing, plotting, ligand SDF writing, starting pymol sessions

Type:

OutputManager

filters

object holding all optional filters

Type:

Filters

_run_mode

refers to whether ringtail is run from the command line or through direct API use, where the former is more restrictive

Type:

str

add_results_from_files(file: str = None, file_path: str = None, file_list: str = None, file_pattern: str = None, recursive: bool = None, receptor_file: str = None, save_receptor: bool = None, filesources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)

Call storage manager to process result files and add to database. Creates a database or adds to an existing one. Options can be provided as a dict or as individual options. If both are provided, individual options will overwrite those from the dictionary.

Parameters:
  • file (str or list(str), optional) – ligand result file

  • file_path (str or list(str), optional) – list of folders containing one or more result files

  • file_list (str or list(str), optional) – list of ligand result file(s)

  • file_pattern (str) – file pattern to use with recursive search in a file_path, “.dlg” for AutoDock-GPU and “.pdbqt” for vina

  • recursive (bool) – used to recursively search file_path for folders inside folders

  • receptor_file (str) – string containing the receptor .pdbqt

  • save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)

  • filesources_dict (dict) – file sources already as an object

  • duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.

  • store_all_poses (bool) – store all ligand poses; overrides max_poses

  • max_poses (int) – how many poses to save (ordered by some score?)

  • add_interactions (bool) – add ligand-receptor interaction data, only in vina mode

  • interaction_tolerance (float) – longest ångström distance that is considered interaction?

  • interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction

  • max_proc (int) – max number of computer processors to use for file reading

  • options_dict (dict) – write options as a dict

Raises:

OptionError
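
The precedence rule stated for add_results_from_files (individual options overwrite values from the options dictionary) can be sketched as follows. This is a minimal illustration, not Ringtail's implementation: merge_options is a hypothetical helper, and it assumes that an individual option left at None counts as "not provided".

```python
def merge_options(options_dict=None, **individual):
    """Hypothetical helper: individual keyword options override dict values."""
    merged = dict(options_dict or {})
    for name, value in individual.items():
        # None is treated as "not provided", so any dict value is kept.
        if value is not None:
            merged[name] = value
    return merged


opts = merge_options(
    {"max_poses": 3, "store_all_poses": False},
    max_poses=5,           # overrides the dictionary value
    store_all_poses=None,  # not provided, dictionary value is kept
)
print(opts)
```

Treating None as "not provided" mirrors the method signature, where every individual option defaults to None.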

add_results_from_vina_string(results_strings: dict = None, receptor_file: str = None, save_receptor: bool = None, resultsources_dict: dict = None, duplicate_handling: str = None, overwrite: bool = None, store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_cutoffs: list = None, max_proc: int = None, options_dict: dict = None, finalize: bool = True)

Call storage manager to process the given vina output string and add it to the database. Creates a new database or adds to an existing one. Options can be provided as a dict or as individual options.

Parameters:
  • results_strings (dict) – dictionary containing the ligand identifier and the docking results string

  • receptor_file (str) – string containing the receptor .pdbqt

  • save_receptor (bool) – whether or not to store the full receptor details in the database (needed for some things)

  • resultsources_dict (dict) – file sources already as an object

  • duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.

  • store_all_poses (bool) – store all ligand poses; overrides max_poses

  • max_poses (int) – how many poses to save (ordered by some score?)

  • add_interactions (bool) – add ligand-receptor interaction data, only in vina mode

  • interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction

  • max_proc (int) – max number of computer processors to use for file reading

  • options_dict (dict) – write options as a dict

Raises:

OptionError
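
The “ignore” and “replace” choices for duplicate_handling correspond naturally to SQLite's conflict clauses. The sketch below shows how those clauses behave on a made-up table; the table layout, the UNIQUE constraint, and the helper names are assumptions for illustration and do not describe how Ringtail stores results. The default behavior (allowing duplicates) is not modeled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the UNIQUE constraint defines what counts as a duplicate.
conn.execute(
    "CREATE TABLE results (ligand TEXT, pose INTEGER, score REAL, "
    "UNIQUE (ligand, pose))"
)

CONFLICT_CLAUSE = {"ignore": "OR IGNORE", "replace": "OR REPLACE"}


def insert_result(row, duplicate_handling):
    """Insert one row using the conflict clause for the chosen handling."""
    clause = CONFLICT_CLAUSE[duplicate_handling]
    conn.execute(f"INSERT {clause} INTO results VALUES (?, ?, ?)", row)


def stored_rows():
    return conn.execute("SELECT ligand, pose, score FROM results").fetchall()


insert_result(("lig1", 1, -7.0), "ignore")
insert_result(("lig1", 1, -8.5), "ignore")   # duplicate is skipped
after_ignore = stored_rows()
insert_result(("lig1", 1, -9.0), "replace")  # duplicate overwrites the old row
after_replace = stored_rows()
print(after_ignore, after_replace)
```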

static default_dict() dict

Creates a dict of all Ringtail options.

Returns:

json string with options

Return type:

str

display_pymol(bookmark_name=None)

Launch pymol session and plot of LE vs docking score. Displays molecules when clicked.

Parameters:

bookmark_name (str) – bookmark name to use in pymol. ‘None’ uses the whole db?

property docking_mode

Private method to retrieve docking mode

Returns:

docking mode

Return type:

str

drop_bookmark(bookmark_name: str)

Drops specified bookmark from the database

Parameters:

bookmark_name (str) – name of bookmark to be dropped.

export_bookmark_db(bookmark_name: str = None) str

Export database containing data from bookmark

Parameters:

bookmark_name (str) – name for bookmark_db

Returns:

name of the new, exported database

Return type:

str

export_csv(requested_data: str, csv_name: str, table=False)

Get requested data from database, export as CSV

Parameters:
  • requested_data (str) – Table name or SQL-formatted query

  • csv_name (str) – Name for exported CSV file

  • table (bool) – flag indicating if requested data is a table name
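
The difference between passing a table name and passing an SQL query can be sketched with the standard library alone. The function below is a hypothetical stand-in that writes to any text stream rather than to a named file; the table and column names are made up for the example.

```python
import csv
import io
import sqlite3


def export_csv_sketch(conn, requested_data, out, table=False):
    """Write a whole table, or the result of a query, to a CSV stream."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    query = f'SELECT * FROM "{requested_data}"' if table else requested_data
    cursor = conn.execute(query)
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])  # header row
    writer.writerows(cursor)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL)")
conn.executemany(
    "INSERT INTO Results VALUES (?, ?)", [("lig1", -7.5), ("lig2", -6.0)]
)

as_table = io.StringIO()
export_csv_sketch(conn, "Results", as_table, table=True)

as_query = io.StringIO()
export_csv_sketch(
    conn, "SELECT LigName FROM Results WHERE docking_score < -7", as_query
)
print(as_table.getvalue())
```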

export_receptors()

Export receptor in database to pdbqt

filter(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, filters_dict: dict | None = None, enumerate_interaction_combs: bool = False, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, log_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, bookmark_name: str = None, filter_bookmark: str = None, options_dict: dict | None = None, return_iter=False)

Prepare list of filters, then hand it off to storageman to perform filtering. Creates a log of all ligand docking results that pass.

Parameters:
  • Filters –

    eworst (float): specify the worst energy value accepted

    ebest (float): specify the best energy value accepted

    leworst (float): specify the worst ligand efficiency value accepted

    lebest (float): specify the best ligand efficiency value accepted

    score_percentile (float): specify the worst energy percentile accepted. Express as percentage, e.g., 1 for top 1 percent.

    le_percentile (float): specify the worst ligand efficiency percentile accepted. Express as percentage, e.g., 1 for top 1 percent.

    vdw_interactions (list[tuple]): define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

    hb_interactions (list[tuple]): define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

    reactive_interactions (list[tuple]): check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

    hb_count (list[tuple]): accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]

    react_any (bool): check if ligand reacted with any residue

    max_miss (int): Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.

    ligand_name (list[str]): specify ligand name(s). Will combine name filters with OR, e.g., [[“lig1”, “lig2”]]

    ligand_substruct (list[str]): SMARTS pattern(s) for substructure matching, e.g., [[“ccc”, “CN”]]

    ligand_substruct_pos (list[list[type]]): SMARTS, index of atom in SMARTS, cutoff dist, and target XYZ coords, e.g., [[“[Oh]C”, 0, 1.2, -5.5, 10.0, 15.5]] -> [[“smart_string”, index_of_positioned_atom, cutoff_distance, x, y, z]]

    ligand_max_atoms (int): Maximum number of heavy atoms a ligand may have

    ligand_operator (str): logical join operator for multiple SMARTS (default: OR), either AND or OR

    filters_dict (dict): provide filters as a dictionary

  • Ligand results options –

    enumerate_interaction_combs (bool): When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.

    output_all_poses (bool): By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.

    mfpt_cluster (float): Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.

    interaction_cluster (float): Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.

    log_file (str): by default, results are saved in output_log.txt; if this option is used, ligands and requested info passing the filters will be written to specified file

    overwrite (bool): by default, if a log file exists, it doesn’t get overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database

    order_results (str): Stipulates how to order the results when written to the log file. By default, results are ordered by the order in which they were added to the database. ONLY TAKES ONE OPTION. Available fields are: “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der Waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds)

    outfields (str): defines which fields are used when reporting the results (to stdout and to the log file); fields are specified as comma-separated values, e.g. --outfields=e,le,hb; by default, docking_score (energy) and ligand name are reported; ligand name is always reported in the first column. Available fields are: “Ligand_name” (Ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der Waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name)

    bookmark_name (str): name for resulting bookmark. Default value is ‘passing_results’

    filter_bookmark (str): name of bookmark to perform filtering over

    options_dict (dict): write options as a dict

    return_iter (bool): return an iterable of all of the filtering results

Returns:

number of ligands passing filter; iter (optional): an iterable of all of the filtering results

Return type:

int
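
The max_miss option is described as computing all combinations of interaction filters that exclude up to max_miss interactions from the given set. One reading of that description can be sketched with itertools; the function name is hypothetical and the residue strings are examples only, so treat this as an illustration of the combinatorics rather than Ringtail's query logic.

```python
from itertools import combinations


def interaction_combinations(interactions, max_miss):
    """Return every subset obtained by dropping 0..max_miss interactions."""
    combos = []
    for n_dropped in range(max_miss + 1):
        n_kept = len(interactions) - n_dropped
        if n_kept <= 0:
            break  # nothing left to filter on
        combos.extend(combinations(interactions, n_kept))
    return combos


wanted = ["A:VAL:279:", "A:LYS:162:", "A:ASP:86:"]
combos = interaction_combinations(wanted, max_miss=1)
for combo in combos:
    print(combo)
```

With three interactions and max_miss=1 this yields the full set plus the three two-member subsets, which gives a sense of why enumerating every combination can increase runtime.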

finalize_write()

Finalize database write by creating interaction tables and setting database version

find_similar_ligands(query_ligname: str)

Find ligands in cluster with query_ligname

Parameters:

query_ligname (str) – name of the ligand in the ligand table for which to find similar ligands

Returns:

number of ligands that are similar

Return type:

int
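
The clusters searched by find_similar_ligands come from the mfpt_cluster and interaction_cluster options, which are described as clustering by Tanimoto distance. As a reminder of what that distance measures, here is a minimal sketch on fingerprints represented as sets of on-bit indices. The bit sets are made up, and the Butina clustering step itself is not shown.

```python
def tanimoto_distance(bits_a, bits_b):
    """Tanimoto distance: 1 - |A intersect B| / |A union B| for two on-bit sets."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / len(union)


fp_query = {1, 2, 3, 4}
fp_other = {3, 4, 5, 6}
distance = tanimoto_distance(fp_query, fp_other)
print(round(distance, 3))
```

A distance of 0 means identical fingerprints and 1 means no shared bits, so a clustering cutoff such as 0.5 groups ligands that share a substantial fraction of their bits.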

static generate_config_file_template()

Outputs to “config.json” in the current working directory if to_file = True; otherwise returns the dict of default option values used for the API (for the command line, a few more options are included that are always used explicitly when using the API).

Parameters:

to_file (bool) – whether to produce the template as a json string or as a file “config.json”

Returns:

file name of config file or json string with template including default values

Return type:

str

get_bookmark_names()

Method to retrieve all bookmark names in a database

Returns:

names of all bookmarks in a database

Return type:

list

static get_options_info() dict

Gets names, default values, and meta data for all Ringtail options.

get_plot_data(bookmark_name: str = None)

Get ligand efficiency and energy for all docking data and for ligands that passed filtering in specified bookmark. Each tuple in the respective lists contains docking_score, leff, pose_id, and ligand name.

Parameters:

bookmark_name (str)

Returns:

[all_data], [filtered_data]

Return type:

list(tuple), list(tuple)

get_previous_filter_data(outfields=None, bookmark_name=None, log_file=None)

Get data requested in self.out_opts[‘outfields’] from the results bookmark of a previous filtering

Parameters:
  • outfields (str) – use outfields as described in RingtailOptions > StorageOptions

  • bookmark_name (str) – bookmark for which the filters were used

ligands_rdkit_mol(bookmark_name=None, write_nonpassing=False) dict

Creates a dictionary of RDKit mols of all ligands specified from a bookmark, either excluding (default) or including those ligands that did not pass the filter(s).

Parameters:
  • bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering

  • write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands

Returns:

containing ligand names, RDKit mols, flexible residue mols, and other ligand properties

Return type:

all_mols (dict)

plot(save=True, bookmark_name: str = None, return_fig_handle: bool = False)

Get data needed for creating Ligand Efficiency vs Energy scatter plot from storageManager. Call OutputManager to create plot.

Parameters:
  • save (bool) – whether to save plot to the current directory

  • bookmark_name (str) – bookmark from which to fetch filtered data to plot

  • return_fig_handle (bool) – use to return a handle to the matplotlib figure instead of saving or showing figure

Returns:

will not show figure if returning figure handle

Return type:

matplotlib.pyplot.figure (optional)

produce_summary(columns=['docking_score', 'leff'], percentiles=[1, 10]) None

Print summary of data in storage to stdout

Parameters:
  • columns (list(str)) – data columns used to prepare summary

  • percentiles (list(int)) – cutoff percentiles for the summary
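
How the percentile cutoffs in the summary are computed is not specified here. As one plausible interpretation, the sketch below uses a nearest-rank cutoff and assumes that lower (more negative) values are better, as is usual for docking scores. The function name and the scores are made up.

```python
import math


def percentile_cutoff(values, percent):
    """Nearest-rank cutoff bounding the best `percent` of values (lower is better)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(len(ordered) * percent / 100))
    return ordered[rank - 1]


# 100 made-up docking scores: -1.0, -2.0, ..., -100.0
scores = [-float(i) for i in range(1, 101)]
cutoffs = {p: percentile_cutoff(scores, p) for p in (1, 10)}
print(cutoffs)
```

Under these assumptions, the top 1 percent of 100 scores is the single best score and the top 10 percent is bounded by the tenth-best score.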

save_receptor(receptor_file)

Add receptor to database.

Parameters:

receptor_file (str) – path to receptor file

set_filters(eworst=None, ebest=None, leworst=None, lebest=None, score_percentile=None, le_percentile=None, vdw_interactions=None, hb_interactions=None, reactive_interactions=None, hb_count=None, react_any=None, max_miss=None, ligand_name=None, ligand_operator=None, ligand_substruct=None, ligand_substruct_pos=None, ligand_max_atoms=None, dict: dict = None)

Create a filter object containing all numerical and string filters.

Parameters:
  • eworst (float) – specify the worst energy value accepted

  • ebest (float) – specify the best energy value accepted

  • leworst (float) – specify the worst ligand efficiency value accepted

  • lebest (float) – specify the best ligand efficiency value accepted

  • score_percentile (float) – specify the worst energy percentile accepted. Express as percentage e.g. 1 for top 1 percent.

  • le_percentile (float) – specify the worst ligand efficiency percentile accepted. Express as percentage e.g. 1 for top 1 percent.

  • vdw_interactions (list[tuple]) – define van der Waals interactions with residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

  • hb_interactions (list[tuple]) – define HB (ligand acceptor or donor) interaction as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

  • reactive_interactions (list[tuple]) – check if ligand reacted with specified residue as [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME]. E.g., [(‘A:VAL:279:’, True), (‘A:LYS:162:’, True)] -> [(‘chain:resname:resid:atomname’, <wanted (bool)>), (‘chain:resname:resid:atomname’, <wanted (bool)>)]

  • hb_count (list[tuple]) – accept ligands with at least the requested number of HB interactions. If a negative number is provided, then accept ligands with no more than the requested number of interactions. E.g., [(‘hb_count’, 5)]

  • react_any (bool) – check if ligand reacted with any residue

  • max_miss (int) – Will compute all possible combinations of interaction filters excluding up to max_miss number of interactions from given set. Default will only return union of poses interaction filter combinations. Use with ‘enumerate_interaction_combs’ for enumeration of poses passing each individual combination of interaction filters.

  • ligand_name (list[str]) – specify ligand name(s). Will combine name filters with OR, e.g., [“lig1”, “lig2”]

  • ligand_substruct (list[str]) – SMARTS pattern(s) for substructure matching, e.g., [“ccc”, “CN”]

  • ligand_substruct_pos (list[str]) – SMARTS, index of atom in SMARTS, cutoff dist, and target XYZ coords, e.g., [‘”[Oh]C” 0 1.2 -5.5 10.0 15.5’] -> [“smart_string index_of_positioned_atom cutoff_distance x y z”]

  • ligand_max_atoms (int) – Maximum number of heavy atoms a ligand may have

  • ligand_operator (str) – logical join operator for multiple SMARTS (default: OR), either AND or OR

  • dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
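
The interaction filters above take strings of the form [-][CHAIN]:[RES]:[NUM]:[ATOM_NAME] paired with a wanted flag. A minimal sketch of splitting such a string into its parts follows. InteractionSpec and parse_interaction are hypothetical names, and interpreting a leading “-” as “unwanted” is an assumption drawn from the pattern, not from Ringtail's code.

```python
from typing import NamedTuple


class InteractionSpec(NamedTuple):
    chain: str
    resname: str
    resid: str
    atomname: str
    wanted: bool


def parse_interaction(spec, wanted=True):
    """Split 'chain:resname:resid:atomname' into fields; empty fields are allowed."""
    if spec.startswith("-"):
        # Assumed meaning of the optional leading "-": the interaction is unwanted.
        spec, wanted = spec[1:], False
    parts = spec.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 4 colon-separated fields, got {len(parts)}")
    return InteractionSpec(*parts, wanted)


filters = [("A:VAL:279:", True), ("A:LYS:162:", False)]
specs = [parse_interaction(text, flag) for text, flag in filters]
print(specs[0])
```

An empty trailing field, as in ‘A:VAL:279:’, leaves the atom name unspecified.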

set_output_options(log_file: str = None, export_sdf_path: str = None, enumerate_interaction_combs: bool = None, dict: dict = None)

Creates output options object that holds attributes related to reading and outputting results. Will assign log_file name and export_sdf_path to the output_manager object.

Parameters:
  • log_file (str) – by default, results are saved in “output_log.txt”; if this option is used, ligands and requested info passing the filters will be written to specified file

  • export_sdf_path (str) – specify the path where to save poses of ligands passing the filters (SDF format); if the directory does not exist, it will be created; if it already exists, an error is raised unless --overwrite is used. NOTE: the log file will be automatically saved in this path. Ligands will be stored as SDF files in the order specified.

  • enumerate_interaction_combs (bool) – When used with max_miss > 0, will log ligands/poses passing each separate interaction filter combination as well as union of combinations. Can significantly increase runtime.

  • dict (dict) – dictionary of one or more of the above args, is overwritten by individual args

set_resultsman_attributes(store_all_poses: bool = None, max_poses: int = None, add_interactions: bool = None, interaction_tolerance: float = None, interaction_cutoffs: list = None, max_proc: int = None, dict: dict = None)

Create results_manager_options object if needed, sets options, and assigns them to the results manager object.

Parameters:
  • store_all_poses (bool) – store all ligand poses; overrides max_poses

  • max_poses (int) – how many poses to save (ordered by some score?)

  • add_interactions (bool) – add ligand-receptor interaction data, only in vina mode

  • interaction_tolerance (float) – longest ångström distance that is considered interaction?

  • interaction_cutoffs (list) – ångström distance cutoffs for x and y interaction

  • max_proc (int) – max number of computer processors to use for file reading

  • dict (dict) – dictionary of one or more of the above args, is overwritten by individual args

set_storageman_attributes(filter_bookmark: str = None, duplicate_handling: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, output_all_poses: str = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, dict: dict = None)

Create storage_manager_options object if needed, sets options, and assigns them to the storage manager object.

Parameters:
  • filter_bookmark (str) – Perform filtering over specified bookmark. (in output group in CLI)

  • duplicate_handling (str, options) – specify how duplicate Results rows should be handled when inserting into database. Options are “ignore” or “replace”. Default behavior will allow duplicate entries.

  • overwrite (bool) – by default, if a log file exists, it doesn’t get overwritten and an error is returned; this option enables overwriting existing log files. Will also overwrite existing database

  • order_results (str) – Stipulates how to order the results when written to the log file. By default, results are ordered by the order in which they were added to the database. ONLY TAKES ONE OPTION. Available fields are: “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der Waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds).

  • outfields (str) – defines which fields are used when reporting the results (to stdout and to the log file); fields are specified as comma-separated values, e.g. “--outfields=e,le,hb”; by default, docking_score (energy) and ligand name are reported. Available fields are: “Ligand_name” (Ligand name), “e” (docking_score), “le” (ligand efficiency), “delta” (delta energy from best pose), “ref_rmsd” (RMSD to reference pose), “e_inter” (intermolecular energy), “e_vdw” (van der Waals energy), “e_elec” (electrostatic energy), “e_intra” (intramolecular energy), “n_interact” (number of interactions), “ligand_smile”, “rank” (rank of ligand pose), “run” (run number for ligand pose), “hb” (hydrogen bonds), “receptor” (receptor name). Fields are printed in the order in which they are provided. Ligand name will always be returned and will be added in first position if not specified.

  • output_all_poses (bool) – By default, will output only top-scoring pose passing filters per ligand. This flag will cause each pose passing the filters to be logged.

  • mfpt_cluster (float) – Cluster filtered ligands by Tanimoto distance of Morgan fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for selecting chemically dissimilar ligands.

  • interaction_cluster (float) – Cluster filtered ligands by Tanimoto distance of interaction fingerprints with Butina clustering and output ligand with lowest ligand efficiency from each cluster. Default clustering cutoff is 0.5. Useful for enhancing selection of ligands with diverse interactions.

  • bookmark_name (str) – name for resulting bookmark. Default value is “passing_results”

  • dict (dict) – dictionary of one or more of the above args, is overwritten by individual args
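
The outfields and order_results options use short field names that map to database column names through field_to_column_name. A rough sketch of how such a mapping could turn those options into a query is shown below. The mapping is a subset of the documented dictionary; build_select, the query shape, and the “Results” source are assumptions for illustration.

```python
# Subset of the documented field_to_column_name mapping.
FIELD_TO_COLUMN = {
    "Ligand_name": "LigName",
    "e": "docking_score",
    "le": "leff",
    "hb": "num_hb",
    "rank": "pose_rank",
}


def build_select(outfields, order_results=None, source="Results"):
    """Hypothetical helper: translate option strings into a SELECT statement."""
    fields = [name.strip() for name in outfields.split(",") if name.strip()]
    if "Ligand_name" not in fields:
        fields.insert(0, "Ligand_name")  # ligand name goes first if not requested
    columns = ", ".join(FIELD_TO_COLUMN[name] for name in fields)
    query = f"SELECT {columns} FROM {source}"
    if order_results is not None:
        query += f" ORDER BY {FIELD_TO_COLUMN[order_results]}"  # one field only
    return query


query = build_select("e,le,hb", order_results="e")
print(query)
```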

update_database_version(consent=False, new_version='2.0.0')

Method to update database version from earlier versions to either 1.1.0 or 2.0.0

write_flexres_pdb(receptor_polymer, ligname: str, filename: str, bookmark_name: str = None)

Writes a receptor pdb with flexible residues based on the ligand provided

Parameters:
  • receptor_polymer (Polymer) – version of receptor produced by meeko

  • ligname (str) – ligand name for which the receptor flexible residue info should be collected

  • filename (str) – name of the output pdb, extension is optional, will default to ‘.pdb’

  • bookmark_name (str, optional) – will use last used bookmark if not specified, will not work in a db without any filtering performed

write_molecule_sdfs(sdf_path: str | None = None, all_in_one: bool = True, bookmark_name: str = None, write_nonpassing: bool = None)

Have output manager write molecule sdf files for passing results in given results bookmark

Parameters:
  • sdf_path (str, optional) – Optional path, existing or to be created in the current directory, where SDF files will be saved

  • all_in_one (bool, optional) – If True will write all molecules to one SDF (separated by $$$$), if False will write one molecule per SDF

  • bookmark_name (str, optional) – Option to run over specified bookmark other than that just used for filtering

  • write_nonpassing (bool, optional) – Option to include non-passing poses for passing ligands

Raises:

StorageError – if bookmark or data not found
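
The all_in_one switch decides between one combined SDF, with records separated by $$$$, and one SDF per molecule. The sketch below illustrates that layout using placeholder strings instead of real mol blocks. The function names and file names are hypothetical, and no files are written.

```python
def sdf_text(mol_blocks):
    """Join mol blocks into SDF text, ending each record with a $$$$ line."""
    return "".join(block.rstrip("\n") + "\n$$$$\n" for block in mol_blocks)


def plan_sdf_files(named_blocks, all_in_one=True, combined_name="passing_results.sdf"):
    """Return {file name: SDF text} for either layout."""
    if all_in_one:
        return {combined_name: sdf_text(named_blocks.values())}
    return {f"{name}.sdf": sdf_text([block]) for name, block in named_blocks.items()}


named_blocks = {
    "lig1": "<mol block for lig1>",  # placeholder, not a valid mol block
    "lig2": "<mol block for lig2>",
}
combined = plan_sdf_files(named_blocks, all_in_one=True)
separate = plan_sdf_files(named_blocks, all_in_one=False)
print(sorted(combined), sorted(separate))
```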

exception ringtail.StorageError

Bases: Exception

class ringtail.StorageManager

Bases: object

check_passing_bookmark_exists(bookmark_name: str | None = None)

Checks if bookmark name is in database

Parameters:

bookmark_name (str, optional) – name of bookmark to check for; if not given, the storageman bookmark_name attribute is used

Returns:

indicates if bookmark_name exists in the current database

Return type:

bool

check_storage_compatibility()

Checks if chosen storage type has been implemented

Parameters:

storage_type (str) – name of the storage type

Raises:

NotImplementedError – raised if selected storage type has not been implemented

Returns:

class of the implemented storage type

Return type:

class
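
A common way to implement this kind of check is a registry that maps storage type names to classes. The sketch below illustrates the pattern with stand-in classes; the registry, the class names, and the function are hypothetical and are not taken from Ringtail's source.

```python
class StorageManagerSketch:
    """Stand-in base class for the example."""


class SQLiteStorageSketch(StorageManagerSketch):
    """Stand-in for an SQLite-backed subclass."""


# Hypothetical registry of implemented storage types.
IMPLEMENTED_STORAGE = {"sqlite": SQLiteStorageSketch}


def check_storage_compatibility_sketch(storage_type):
    """Return the class for storage_type or raise NotImplementedError."""
    try:
        return IMPLEMENTED_STORAGE[storage_type]
    except KeyError:
        raise NotImplementedError(
            f"storage type {storage_type!r} has not been implemented"
        ) from None


storage_class = check_storage_compatibility_sketch("sqlite")
print(storage_class.__name__)
```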

close_storage(attached_db=None, vacuum=False)

Close connection to database

Parameters:
  • attached_db (str, optional) – name of attached DB (not including file extension)

  • vacuum (bool, optional) – indicates that database should be vacuumed before closing

crossref_filter(new_db: str, bookmark1_name: str, bookmark2_name: str, selection_type='-', old_db=None) tuple

Selects ligands found or not found in the given bookmark in both current db and new_db. Stores as temp view

Parameters:
  • new_db (str) – file name for database to attach

  • bookmark1_name (str) – string for name of first bookmark/temp table to compare

  • bookmark2_name (str) – string for name of second bookmark to compare

  • selection_type (str) – “+” or “-” indicating if ligand names should (“+”) or should not (“-”) be in both databases

  • old_db (str, optional) – file name for previous database

Returns:

(name of new bookmark (str), number of ligands passing new bookmark (int))

Return type:

tuple
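
The effect of selection_type can be illustrated with plain sets of ligand names. This sketch assumes that “+” keeps names present in both bookmarks and that “-” keeps names from the first bookmark that are absent from the second; the actual method works on database views, and its exact semantics for “-” are not spelled out here.

```python
def crossref_names(names_current, names_new, selection_type="-"):
    """Select ligand names by comparing two collections of names."""
    current, new = set(names_current), set(names_new)
    if selection_type == "+":
        return sorted(current & new)  # found in both
    if selection_type == "-":
        return sorted(current - new)  # found in the first, not in the second
    raise ValueError("selection_type must be '+' or '-'")


bookmark1 = ["lig1", "lig2", "lig3"]
bookmark2 = ["lig2", "lig3", "lig4"]
in_both = crossref_names(bookmark1, bookmark2, "+")
only_first = crossref_names(bookmark1, bookmark2, "-")
print(in_both, only_first)
```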

field_to_column_name = {'Ligand_name': 'LigName', 'delta': 'deltas', 'e': 'docking_score', 'e_elec': 'energies_electro', 'e_inter': 'energies_inter', 'e_intra': 'energies_intra', 'e_vdw': 'energies_vdw', 'hb': 'num_hb', 'interactions': 'interactions', 'le': 'leff', 'ligand_smile': 'ligand_smile', 'n_interact': 'nr_interactions', 'rank': 'pose_rank', 'receptor': 'receptor', 'ref_rmsd': 'reference_rmsd', 'run': 'run_number'}

filter_results(all_filters: dict, suppress_output=False) iter

Generate and execute database queries from given filters.

Parameters:
  • all_filters (dict) – dict containing all filters. Expects format and keys corresponding to ringtail.Filters().todict()

  • suppress_output (bool) – controls whether the filtering summary is printed to stdout

Returns:

iterable, such as an sqlite cursor, of passing results

Return type:

iter

finalize_database_write()

Finalizes a database after it has been written to, and saves the current database schema to the SQLite database.

get_plot_data(bookmark_name: str = None, only_passing=False)

This function is expected to return an ascii plot representation of the results

Parameters:
  • bookmark_name (str) – name of bookmark for which to fetch passing data. Will use default bookmark name if None. Returns empty list if bookmark does not exist.

  • only_passing (bool) – Only return data for passing ligands. Will return empty list for all data.

Returns:

cursors as (<all data cursor>, <passing data cursor>)

Return type:

tuple

insert_data(results_array, ligands_array, interaction_list, receptor_array=[], insert_receptor=False)

Inserts data from all arrays returned from results manager.

Parameters:
  • results_array (list) – list of data to be stored in Results table

  • ligands_array (list) – list of data to be stored in Ligands table

  • interaction_list (list) – list of data to be stored in interaction tables

  • receptor_array (list) – list of data to be stored in Receptors table

  • insert_receptor (bool, optional) – flag indicating that receptor info should be inserted

insert_interactions(Pose_IDs: list, interactions_list, duplicates)

Takes list of interactions, inserts into database

Parameters:
  • Pose_IDs (list(int)) – list of pose ids assigned while writing the current results to database

  • interactions_list (list) – List of tuples for interactions in form (“type”, “chain”, “residue”, “resid”, “recname”, “recid”)

  • duplicates (list(Pose_ID)) – any duplicates identified in “insert_results”, if duplicate handling has been specified

prune()

Deletes rows from results, ligands, and interactions in a bookmark if they do not pass filtering criteria

class ringtail.StorageManagerSQLite(db_file: str = None, overwrite: bool = None, order_results: str = None, outfields: str = None, filter_bookmark: str = None, output_all_poses: bool = None, mfpt_cluster: float = None, interaction_cluster: float = None, bookmark_name: str = None, duplicate_handling: str = None)

Bases: StorageManager

SQLite-specific StorageManager subclass

conn

Connection to database

Type:

SQLite.conn

open_cursors

list of cursors that were not closed by the function that created them. Will be closed by close_connection method.

Type:

list

db_file

database name

Type:

str

overwrite

switch to overwrite database if it exists

Type:

bool

order_results

what column name will be used to order results once read

Type:

str

outfields

data fields/columns to include when reading and outputting data

Type:

str

filter_bookmark

name of bookmark that filtering will be performed over

Type:

str

output_all_poses

whether or not to output all poses of a ligand

Type:

bool

mfpt_cluster

Tanimoto distance cutoff used to cluster ligands based on Morgan fingerprints

Type:

float

interaction_cluster

Tanimoto distance cutoff used to cluster ligands based on interaction fingerprints

Type:

float

bookmark_name

name of current bookmark being written to or read from

Type:

str

duplicate_handling

optional attribute to deal with insertion of ligands already in the database

Type:

str

current_bookmark_name

name of last view to have been written to in the database

Type:

str

filtering_window

name of bookmark/view being filtered on

Type:

str

index_columns

Type:

list

view_suffix

current suffix for views

Type:

int

temptable_suffix

current suffix for temporary tables

Type:

int

field_to_column_name

Dictionary for converting ringtail options into DB column names

Type:

dict

bookmark_has_rows(bookmark_name: str) bool

Method that checks if a given bookmark has any data in it

Parameters:

bookmark_name (str) – view to check

Returns:

True if more than zero rows in bookmark

Return type:

bool

check_ringtaildb_version()

Checks the database version and confirms whether the code base is compatible with it

Returns:

whether or not db is compatible with the code base; str: current database version

Return type:

bool

check_storage_ready(run_mode: str, docking_mode: str, store_all_poses: bool, max_poses: int)

Check that storage is ready before proceeding, and creates new tables if needed

Parameters:
  • run_mode (str) – whether ringtail is run using the command line interface or the API

  • docking_mode (str) – what docking engine was used to produce results

  • store_all_poses (bool) – overrides max_poses

  • max_poses (int) – max poses to save to db

Raises:

clone(backup_name=None)

Creates a copy of the db

Parameters:

backup_name (str, optional) – name of the cloned database
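
For an SQLite database, one straightforward way to produce such a copy is the backup API of Python's sqlite3 module. The sketch below is not Ringtail's implementation; clone_db is a hypothetical helper, and in-memory databases are used so that the example leaves no files behind.

```python
import sqlite3


def clone_db(source, backup_name=":memory:"):
    """Copy the whole database behind `source` into a new one and return it."""
    target = sqlite3.connect(backup_name)
    source.backup(target)  # sqlite3.Connection.backup, available since Python 3.7
    return target


source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL)")
source.execute("INSERT INTO Results VALUES ('lig1', -8.0)")
source.commit()

backup = clone_db(source)
source.execute("DELETE FROM Results")  # later changes do not reach the copy
source.commit()
copied = backup.execute("SELECT LigName, docking_score FROM Results").fetchall()
print(copied)
```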

count_receptors_in_db()

Returns number of rows in Receptors table where receptor_object already has blob

Returns:

number of rows in receptors table; str: name of receptor if present in table

Return type:

int

Raises:

DatabaseQueryError

create_bookmark(name, query, temp=False, add_poseID=False, filters={})

Takes a name and a selection query and creates a bookmark with that name. Bookmarks are Ringtail-specific views whose information is stored in the ‘Bookmark’ table. Known issue (FIXME): results from ligand-filter-only queries are not added as bookmarks.

Parameters:
  • name (str) – Name for bookmark which will be created

  • query (str) – SQLite-formatted query used to create bookmark

  • temp (bool, optional) – Flag if bookmark should be temporary

  • add_poseID (bool, optional) – Add Pose_ID column to bookmark

  • filters (dict, optional) – a dict of filters used to construct the query
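
The idea of a bookmark as a view plus a bookkeeping record can be sketched with plain sqlite3. This is only an illustration under assumed names: the helper create_view_bookmark, the columns of the bookkeeping table, and the filter key max_docking_score are hypothetical and may not match Ringtail's schema.

```python
import json
import sqlite3


def create_view_bookmark(conn, name, query, filters=None):
    """Create a view from `query` and record it in a bookkeeping table.

    Hypothetical helper; the Bookmarks columns used here are assumed.
    """
    quoted = '"' + name.replace('"', '""') + '"'
    conn.execute(f"DROP VIEW IF EXISTS {quoted}")
    conn.execute(f"CREATE VIEW {quoted} AS {query}")
    conn.execute(
        "INSERT OR REPLACE INTO Bookmarks (Bookmark_name, Query, filters) VALUES (?, ?, ?)",
        (name, query, json.dumps(filters or {})),
    )
    conn.commit()


# Toy schema (assumed for this example).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Results (Pose_ID INTEGER PRIMARY KEY, LigName TEXT, docking_score REAL)"
)
conn.execute(
    "CREATE TABLE Bookmarks (Bookmark_name TEXT PRIMARY KEY, Query TEXT, filters TEXT)"
)
conn.executemany(
    "INSERT INTO Results (LigName, docking_score) VALUES (?, ?)",
    [("lig_a", -9.1), ("lig_b", -5.4), ("lig_c", -8.0)],
)

create_view_bookmark(
    conn,
    "strong_binders",
    "SELECT * FROM Results WHERE docking_score <= -8.0",
    {"max_docking_score": -8.0},  # hypothetical filter key
)

bookmark_rows = conn.execute(
    'SELECT LigName FROM "strong_binders" ORDER BY LigName'
).fetchall()
bookmark_names = [row[0] for row in conn.execute("SELECT Bookmark_name FROM Bookmarks")]
```

Storing the query and filters alongside the view name is what lets the filters be recovered later from the bookmark name alone.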

create_bookmark_from_temp_table(temp_table_name, bookmark_name, original_bookmark_name, wanted_list, unwanted_list=[])

Resaves the temp bookmark stored in self.current_bookmark_name as a new permanent bookmark

Parameters:
  • bookmark_name (str) – name of bookmark to save last temp bookmark as

  • original_bookmark_name (str) – name of original bookmark

  • wanted_list (list) – List of wanted database names

  • unwanted_list (list, optional) – List of unwanted database names

  • temp_table_name (str) – name of temporary table

create_temp_table_from_bookmark()

Method that creates a temporary table named “passing_temp”. Please note that this table will be dropped as soon as the database connection closes.
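
The lifetime of a temporary table can be demonstrated with plain sqlite3. The sketch below creates a temporary table from a view, closes the connection, and shows that the table is gone on reconnect. Only the name passing_temp comes from the text above; the schema and the source view are assumptions.

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

# First connection: build a toy schema and a temporary table from a view.
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE Results (Pose_ID INTEGER PRIMARY KEY, docking_score REAL)")
conn.executemany("INSERT INTO Results (docking_score) VALUES (?)", [(-9.1,), (-5.4,)])
conn.execute("CREATE VIEW passing_results AS SELECT * FROM Results WHERE docking_score < -8")
conn.execute("CREATE TEMP TABLE passing_temp AS SELECT * FROM passing_results")
conn.commit()
rows_in_temp = conn.execute("SELECT COUNT(*) FROM passing_temp").fetchone()[0]
conn.close()

# Second connection: the temporary table no longer exists.
conn = sqlite3.connect(db_path)
try:
    conn.execute("SELECT COUNT(*) FROM passing_temp")
    temp_survived = True
except sqlite3.OperationalError:
    temp_survived = False
conn.close()
```

Anything that must outlive the connection therefore has to be saved to a permanent table or view before closing.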

drop_bookmark(bookmark_name: str)

Drops specified bookmark from database

Parameters:

bookmark_name (str) – bookmark to be dropped

Raises:

DatabaseInsertionError

fetch_bookmark(bookmark_name: str) Cursor

returns SQLite cursor of all fields in bookmark

Parameters:

bookmark_name (str) – name of bookmark to retrieve

Returns:

cursor of requested view

Return type:

sqlite3.Cursor

fetch_clustered_similars(ligname: str)

Given ligname, returns poseids for similar poses/ligands from previous clustering. The user is prompted at runtime to choose a cluster.

Parameters:

ligname (str) – ligname for ligand to find similarity with

Raises:

fetch_data_for_passing_results() iter

Returns a SQLite cursor with the requested data for outfields for poses that passed the filter in self.bookmark_name

Returns:

SQLite cursor of data for the passing poses

Return type:

iter

Raises:

OptionError

fetch_filters_from_bookmark(bookmark_name: str | None = None)

Method that will retrieve filter values used to construct bookmark

Parameters:

bookmark_name (str, optional) – can get filter values for given bookmark, or filter values from currently active bookmark in storageman

Returns:

the filter data

Return type:

dict

fetch_flexres_info()

Fetches flexible residue names and atom name lists

Returns:

(flexible_residues, flexres_atomnames)

Return type:

tuple

fetch_interaction_info_by_index(interaction_idx) tuple

Returns tuple containing interaction info for given interaction_idx

Parameters:

interaction_idx (int) – interaction index to fetch info for

Returns:

tuple of info for requested interaction

Return type:

tuple

fetch_nonpassing_pose_properties(ligname)

fetch coordinates for poses of ligname which did not pass the filter

Parameters:

ligname (str) – name of ligand to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_passing_ligand_output_info() iter

fetch information required by vsmanager for writing out molecules

Returns:

iterable that contains LigName, ligand_smile, atom_index_map, hydrogen_parents

Return type:

iter

fetch_passing_pose_properties(ligname)

fetch coordinates for poses passing filter for given ligand

Parameters:

ligname (str) – name of ligand to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_pose_interactions(Pose_ID) iter

Fetch all interactions parameters belonging to a Pose_ID

Parameters:

Pose_ID (int) – pose id, 1-1 with Results table

Returns:

iterable of interaction information for the given Pose_ID

Return type:

iter

fetch_receptor_object_by_name(rec_name)

Returns Receptor object from database for given rec_name

Parameters:

rec_name (str) – Name of receptor to return object for

Returns:

receptor object as a string

Return type:

str

fetch_receptor_objects()

Returns all Receptor objects from database

Returns:

iterable of receptor names and objects

Return type:

iter (tuple)

fetch_single_ligand_output_info(ligname) str

get output information for given ligand

Parameters:

ligname (str) – ligand name

Raises:

DatabaseQueryError

Returns:

information containing smiles, atom and index mapping, and hydrogen parents

Return type:

str

fetch_single_pose_properties(pose_ID: int) iter

fetch coordinates for pose given by pose_ID

Parameters:

pose_ID (int) – ID of the pose to fetch coordinates for

Returns:

SQLite cursor that contains Pose_ID, docking_score, leff, ligand_coordinates, flexible_res_coordinates, flexible_residues

Return type:

iter

fetch_summary_data(columns=['docking_score', 'leff'], percentiles=[1, 10]) dict

Collect summary data for database:
  • Num Ligands

  • Num stored poses

  • Num unique interactions

  • min, max, percentiles for columns in columns

Parameters:
  • columns (list (str)) – columns to be displayed and used in summary

  • percentiles (list(int)) – percentiles to consider

Returns:

summary of the data

Return type:

dict
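
To illustrate the kind of summary listed above, here is a small sketch over a toy Results table. It uses the nearest-rank percentile convention, which is an assumption (the source does not say which convention is used), and it omits the unique-interaction count. The helpers summarize and nearest_rank_percentile, the dictionary keys and the table layout are all hypothetical.

```python
import math
import sqlite3


def nearest_rank_percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list (one of several conventions)."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(conn, columns=("docking_score", "leff"), percentiles=(1, 10)):
    """Collect counts plus min, max and percentiles for each requested column."""
    summary = {
        "num_ligands": conn.execute(
            "SELECT COUNT(DISTINCT LigName) FROM Results"
        ).fetchone()[0],
        "num_poses": conn.execute("SELECT COUNT(*) FROM Results").fetchone()[0],
    }
    for col in columns:
        # Column names are interpolated directly, so pass only trusted names.
        values = [
            row[0]
            for row in conn.execute(
                f"SELECT {col} FROM Results WHERE {col} IS NOT NULL ORDER BY {col}"
            )
        ]
        summary[f"min_{col}"] = values[0]
        summary[f"max_{col}"] = values[-1]
        for pct in percentiles:
            summary[f"{pct}%_{col}"] = nearest_rank_percentile(values, pct)
    return summary


# Toy data: 10 ligands with 2 poses each, scores from -1.0 down to -20.0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Results (LigName TEXT, docking_score REAL, leff REAL)")
conn.executemany(
    "INSERT INTO Results VALUES (?, ?, ?)",
    [(f"lig_{(i - 1) // 2}", -1.0 * i, -0.25 * i) for i in range(1, 21)],
)

summary = summarize(conn)
```

Because more negative docking scores sort first, the low percentiles here correspond to the best-scoring poses.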

classmethod format_for_storage(ligand_dict: dict) tuple

Takes the file dictionary from the file parser and formats it into the required storage format

Parameters:

ligand_dict (dict) – Dictionary containing data from the fileparser

Returns:

tuple of lists ([result_row_1, result_row_2, …], ligand_row, [interaction_tuple_1, interaction_tuple_2, …])

Return type:

tuple

get_all_bookmark_names()

Get all bookmarks in the SQL database as a list of names. Bookmarks are a specific type of SQLite view whose information is stored in the Bookmarks table.

Returns:

list of bookmark names

Return type:

list

get_current_bookmark_name()

returns current bookmark name

Returns:

name of last passing results bookmark used by database

Return type:

str

get_maxmiss_union(total_combinations: int)

Get results that are in union considering max miss

Parameters:

total_combinations (int) – number of possible combinations

Returns:

iterable of passing results

Return type:

iter

insert_receptor_blob(receptor, rec_name)

Takes object of Receptor class, updates the column in Receptor table

Parameters:
  • receptor (bytes) – bytes receptor object to be inserted into DB

  • rec_name (string) – Name of receptor. Used to insert into correct row of DB

Raises:

DatabaseInsertionError

overwrite_storage()

Will drop all tables in the database.

set_bookmark_suffix(suffix)

Sets internal bookmark_suffix variable

Parameters:

suffix (str) – suffix to attach to bookmark-related queries or creation

to_dataframe(requested_data: str, table=True) pandas.DataFrame

Returns a pandas DataFrame of the table or query given as requested_data

Parameters:
  • requested_data (str) – String containing SQL-formatted query or table name

  • table (bool) – Flag indicating if requested_data is table name or not

Returns:

dataframe of requested data

Return type:

pd.DataFrame

update_database_version(new_version, consent=False)

Method that updates the SQLite database schema from 1.0.0 or 1.1.0 to 1.1.0 or 2.0.0

Note: If you created a version 1 database with the duplicate handling option, there is a chance of inconsistent behavior of anything involving interactions, as the Pose_ID was not used as an explicit foreign key in db v1.0.0 and v1.1.0.

Parameters:

consent (bool, optional) – variable to ensure consent to update database is explicit

Returns:

bool

exception ringtail.WriteToStorageError

Bases: Exception

class ringtail.Writer(*args: Any, **kwargs: Any)

Bases: Process

This class is a listener that retrieves data from the queue and writes it into the database

process_data(data_packet)

Breaks up the data in the data_packet to distribute between the different arrays to be inserted in the database.

Parameters:

data_packet (any) – File packet to be processed

run()

Method overload from parent class. This is where the task of this class is performed. Each multiprocess.Process class must have a “run” method which is called by the initialization (see below) with start()

Raises:

WriteToStorageError

write_to_storage()

Inserts data into the database through the designated storage manager.
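
The listener pattern described for this class can be sketched as follows. Note the substitution: to stay short and portable, this sketch uses threading.Thread and queue.Queue rather than a separate process. The class name ListenerWriter, the None sentinel, the batch size, the packet keys and the table layout are all assumptions made for illustration and do not reflect Ringtail's implementation.

```python
import os
import queue
import sqlite3
import tempfile
import threading

SENTINEL = None  # assumed end-of-stream marker


class ListenerWriter(threading.Thread):
    """Pull packets off a queue, buffer them, and write them in batches."""

    def __init__(self, work_queue, db_path, batch_size=2):
        super().__init__()
        self.work_queue = work_queue
        self.db_path = db_path
        self.batch_size = batch_size
        self.pending = []
        self.written = 0

    def run(self):
        # Open the connection inside the thread that will use it.
        self.conn = sqlite3.connect(self.db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS Results (LigName TEXT, docking_score REAL)"
        )
        while True:
            packet = self.work_queue.get()
            if packet is SENTINEL:
                break
            self.process_data(packet)
            if len(self.pending) >= self.batch_size:
                self.write_to_storage()
        self.write_to_storage()  # flush whatever is left
        self.conn.close()

    def process_data(self, packet):
        # Split the packet into the row layout used for insertion.
        self.pending.append((packet["ligname"], packet["score"]))

    def write_to_storage(self):
        if not self.pending:
            return
        self.conn.executemany("INSERT INTO Results VALUES (?, ?)", self.pending)
        self.conn.commit()
        self.written += len(self.pending)
        self.pending = []


db_path = os.path.join(tempfile.mkdtemp(), "writer_demo.db")
work_queue = queue.Queue()
writer = ListenerWriter(work_queue, db_path)
writer.start()
for i, score in enumerate([-9.1, -7.3, -5.4]):
    work_queue.put({"ligname": f"lig_{i}", "score": score})
work_queue.put(SENTINEL)
writer.join()

check = sqlite3.connect(db_path)
stored_count = check.execute("SELECT COUNT(*) FROM Results").fetchone()[0]
check.close()
```

Batching the inserts keeps the number of commits low, and the final flush after the sentinel ensures that a partially filled batch is not lost.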

ringtail.parse_single_dlg(fname)

Parse an ADGPU DLG file, either uncompressed or gzipped

Parameters:

fname (str) – ligand docking result file name

Raises:

Returns:

parsed results ready to be inserted in database

Return type:

dict
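
Handling both uncompressed and gzipped input usually comes down to choosing the right opener. The sketch below shows one common way to do that by file suffix. It does not parse any DLG content, and the helper open_maybe_gzipped and the file names are hypothetical.

```python
import gzip
import os
import tempfile


def open_maybe_gzipped(fname):
    """Open a text file for reading, transparently handling a .gz suffix.

    Hypothetical helper for illustration only; not part of Ringtail.
    """
    if fname.endswith(".gz"):
        return gzip.open(fname, "rt")  # "rt" decodes the stream as text
    return open(fname, "r")


# Write the same content once as plain text and once gzipped.
workdir = tempfile.mkdtemp()
plain_path = os.path.join(workdir, "ligand.dlg")
gz_path = plain_path + ".gz"
content = "line one\nline two\n"
with open(plain_path, "w") as fh:
    fh.write(content)
with gzip.open(gz_path, "wt") as fh:
    fh.write(content)

# Both paths can now be read through the same code path.
with open_maybe_gzipped(plain_path) as fh:
    plain_lines = fh.readlines()
with open_maybe_gzipped(gz_path) as fh:
    gz_lines = fh.readlines()
```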

ringtail.parse_vina_result(data_pointer) dict

Parser for vina docking results, supporting either pdbqt or gzipped (.gz) files, or with the docking results provided as a string.

Parameters:

data_pointer (any) – either filename or dictionary of string docking results

Returns:

parsed results ready to be inserted in database

Return type:

dict