Multi-step protocol for HTVS
For high-throughput virtual screening (HTVS) applications, where computing performance is important, the recommended rDock protocol is to limit the search space (i.e. use a rigid receptor), apply the grid-based scoring function, and/or use a multi-step protocol that stops sampling poor scorers as soon as possible.
Using a multi-step protocol for the DUD system COMT, the computational time can be reduced by 7.5-fold without affecting performance by:
running 5 docking runs for all ligands;
running 10 further runs for ligands achieving a score of -22 or lower;
continuing up to 50 runs for those ligands achieving a score of -25 or lower.
The optimal protocol is specific for each particular system and parameter-set, but can be identified with a purpose-built script (see the Reference guide, section
This tutorial shows how to create and run a multi-step protocol for an HTVS campaign.
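To see where the savings of a multi-step protocol come from, the average number of runs per ligand can be estimated with a simple cost model. The sketch below is a back-of-the-envelope illustration only: the function name and the pass fractions are hypothetical, the fractions are not taken from the COMT study, and rbhtfinder may compute its own time estimate differently.

```python
def mean_runs_per_ligand(n1, n2, n_final, frac_pass1, frac_pass2):
    """Average docking runs per ligand for a three-stage protocol.

    n1, n2 and n_final are cumulative run counts at the end of each stage.
    frac_pass1 and frac_pass2 are the fractions of ALL ligands that pass
    the first and the second filter (hypothetical inputs).
    """
    return n1 + frac_pass1 * (n2 - n1) + frac_pass2 * (n_final - n2)


if __name__ == "__main__":
    # Stage layout as in the COMT example: 5 runs, 10 further runs (15 in
    # total), then up to 50 runs. The pass fractions below are made up.
    runs = mean_runs_per_ligand(5, 15, 50, frac_pass1=0.20, frac_pass2=0.05)
    print(f"average runs per ligand: {runs:.2f}")
    print(f"speed-up versus 50 runs for every ligand: {50 / runs:.2f}x")
```

The model makes the trade-off explicit: the fewer ligands that survive each filter, the closer the cost gets to the number of runs in the first stage.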
Step 1: Create the multi-step protocol
These are the instructions for running rbhtfinder:
1st) exhaustive docking of a small representative part of the whole library.
2nd) Store the result of sdreport -t over that exhaustive dock. in file that will be the input of this script.
3rd) rbhtfinder <sdreport_file> <output_file> <thr1max> <thr1min> <ns1> <ns2>

<ns1> and <ns2> are the number of steps in stage 1 and in stage 2. If not present, the default values are 5 and 15
<thrmax> and <thrmin> setup the range of thresholds that will be simulated in stage 1. The threshold of stage 2 depends on the value of the threshold of stage 1.

An input of -22 -24 will try protocols:
5 -22 15 -27
5 -22 15 -28
5 -22 15 -29
5 -23 15 -28
5 -23 15 -29
5 -23 15 -30
5 -24 15 -29
5 -24 15 -30
5 -24 15 -31

Output of the program is a 7 column values.
First column represents the time. This is a percentage of the time it would take to do the docking in exhaustive mode, i.e. docking each ligand 100 times. Anything above 12 is too long.
Second column is the first percentage. Percentage of ligands that pass the first stage.
Third column is the second percentage. Percentage of ligands that pass the second stage.
The four last columns represent the protocol.
All the protocols tried are written at the end. The ones for which time is less than 12%, perc1 is less than 30% and perc2 is less than 5% but bigger than 1% will have a series of *** after, to indicate they are good choices

WARNING! This is a simulation based in a small set. The numbers are an indication, not factual values.
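The list of protocols tried for an input of -22 -24 follows a regular pattern: each stage-1 threshold is paired with stage-2 thresholds that are 5, 6 and 7 units lower. The sketch below reproduces that enumeration. The function name and the offsets parameter are hypothetical, and the pattern is inferred from the example in the instructions rather than taken from the rbhtfinder source; integer thresholds are assumed.

```python
def candidate_protocols(thr1_max, thr1_min, ns1=5, ns2=15, offsets=(5, 6, 7)):
    """Enumerate (ns1, thr1, ns2, thr2) tuples as in the instructions' example.

    thr1 runs from thr1_max down to thr1_min in steps of 1; for each thr1,
    thr2 is thr1 minus each offset (pattern inferred from the example).
    """
    protocols = []
    for thr1 in range(thr1_max, thr1_min - 1, -1):
        for offset in offsets:
            protocols.append((ns1, thr1, ns2, thr1 - offset))
    return protocols


if __name__ == "__main__":
    for ns1, thr1, ns2, thr2 in candidate_protocols(-22, -24):
        print(ns1, thr1, ns2, thr2)
```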
Step 1, substep 1: Exhaustive docking
As stated in the instructions, the first step is to run an exhaustive docking of a representative part of the library to be screened.
For rDock, exhaustive docking means doing 100 runs for each ligand, whereas standard docking means 50 runs for each ligand:
$ rbdock -i INPUT.sd -o OUTPUT -r PRMFILE.prm -p dock.prm -n 100
Step 1, substep 2:
Once the exhaustive docking has finished, the results have to be saved in a single file, and the output of sdreport -t on that file will be used as input for rbhtfinder:
$ sdreport -t OUTPUT.sd > sdreport_results.txt
Step 1, substep 3:
The last step is to run the rbhtfinder script (download sdreport_results.txt for testing):
$ rbhtfinder sdreport_results.txt htvs_protocol.txt -10 -20 7 25
This will result in a file called htvs_protocol.txt.
The parameters are explained in the script instructions. They depend on the system, so you will probably have to try several different values in order to obtain good parameter sets (marked with *** in the output).
A parameter set is marked when time is less than 12%, perc1 (percentage of ligands that pass the first filter) is less than 30% and perc2 (percentage of ligands that pass the second filter) is less than 5% but bigger than 1%.
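The selection rule for a good parameter set can be written as a single predicate. This is a minimal sketch of the criteria as stated in the text; the function name is hypothetical and the code is not taken from rbhtfinder itself.

```python
def is_good_choice(time_pct, perc1, perc2):
    """Return True when a protocol meets the thresholds quoted in the text.

    time_pct: estimated time as a percentage of exhaustive docking.
    perc1, perc2: percentage of ligands passing the first and second filter.
    """
    return time_pct < 12.0 and perc1 < 30.0 and 1.0 < perc2 < 5.0


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(is_good_choice(10.0, 20.0, 3.0))   # all three criteria met
    print(is_good_choice(10.0, 20.0, 0.5))   # perc2 not bigger than 1%
```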
Step 2: Run rDock with the Multi-Step Protocol
The script finished with two good parameter sets:
TIME PERC1 PERC2 N1 THR1 N2 THR2
[...]
11.928, 27.461, 3.207, 7, -12, 25, -17 ***
[...]
10.508, 18.773, 1.511, 7, -13, 25, -18 ***
[...]
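When the output contains many protocols, the starred rows can be picked out with a few lines of code. The sketch below assumes the output has exactly the layout of the excerpt above (seven comma-separated values followed by ***); the class and function names are hypothetical.

```python
from typing import List, NamedTuple


class Protocol(NamedTuple):
    time: float
    perc1: float
    perc2: float
    n1: int
    thr1: float
    n2: int
    thr2: float


def parse_starred(lines: List[str]) -> List[Protocol]:
    """Collect the rows that end with '***' and have seven fields."""
    good = []
    for raw in lines:
        line = raw.strip()
        if not line.endswith("***"):
            continue  # header, elided rows and unmarked protocols
        fields = [f.strip() for f in line[:-3].split(",")]
        if len(fields) != 7:
            continue  # layout differs from the assumed one
        t, p1, p2, n1, thr1, n2, thr2 = fields
        good.append(Protocol(float(t), float(p1), float(p2),
                             int(n1), float(thr1), int(n2), float(thr2)))
    return good


if __name__ == "__main__":
    excerpt = [
        "TIME PERC1 PERC2 N1 THR1 N2 THR2",
        "11.928, 27.461, 3.207, 7, -12, 25, -17 ***",
        "10.508, 18.773, 1.511, 7, -13, 25, -18 ***",
    ]
    for protocol in parse_starred(excerpt):
        print(protocol)
```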
These parameters have to be adapted to a file with the HTVS protocol format that rDock understands.
A template file looks as follows (THR1, THR2, N1 and N2 are the parameters found above):
3
if - <THR1> SCORE.INTER 1.0 if - SCORE.NRUNS <N1-1> 0.0 -1.0,
if - <THR2> SCORE.INTER 1.0 if - SCORE.NRUNS <N2-1> 0.0 -1.0,
if - SCORE.NRUNS 49 0.0 -1.0,
1
- SCORE.INTER -10,
It is divided into two sections, Running Filters and Writing Filters (each introduced by a line containing a single number).
The first line (the number 3) indicates the number of lines in the Running Filters:
The first filter is defined as follows: if the number of runs reaches N1 and the score is lower than THR1, continue to filter 2; otherwise stop with that ligand and go to the next one.
The second filter is defined similarly to the first one: if the number of runs reaches N2 and the score is lower than THR2, continue to filter 3; otherwise stop and go to the next ligand.
If a ligand has passed the first two filters, continue up to 50 runs.
The fifth line (the number 1 after the three Running Filters) indicates the number of lines in the Writing Filters:
Only print out those poses where SCORE.INTER is lower than -10 (to avoid excessive printing).
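The effect of the three Running Filters on a single ligand can be sketched as a small decision function. This is only an illustration of the behaviour described above, not rbdock's implementation: the function name is hypothetical, and it assumes that "the score" at each checkpoint is the best (lowest) SCORE.INTER seen so far.

```python
def runs_for_ligand(scores, n1, thr1, n2, thr2, n_max=50):
    """Number of runs a ligand receives under a two-threshold protocol.

    scores: non-empty list of per-run SCORE.INTER values, in run order.
    Assumption: a filter is passed when the best score within the first
    n runs is lower than the threshold.
    """
    if min(scores[:n1]) >= thr1:
        return n1      # failed filter 1: stop after n1 runs
    if min(scores[:n2]) >= thr2:
        return n2      # failed filter 2: stop after n2 runs
    return n_max       # passed both filters: continue up to n_max runs


if __name__ == "__main__":
    # Hypothetical ligand: one good early pose, nothing better afterwards.
    scores = [-13.0] + [-5.0] * 49
    print(runs_for_ligand(scores, n1=7, thr1=-12, n2=25, thr2=-17))
```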
For the parameters obtained in the first section of this tutorial (first line with ***), we will have to generate a file as follows:
3
if - -12 SCORE.INTER 1.0 if - SCORE.NRUNS 6 0.0 -1.0,
if - -17 SCORE.INTER 1.0 if - SCORE.NRUNS 24 0.0 -1.0,
if - SCORE.NRUNS 49 0.0 -1.0,
1
- SCORE.INTER -10,
Please note that the parameters N1 and N2 are 7 and 25 but we write 6 and 24, respectively, as stated in the template.
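Because the off-by-one in the run counts is easy to get wrong by hand, the protocol file can also be generated from the four parameters. The sketch below fills in the template shown earlier; the function name and its keyword arguments are hypothetical, and it assumes the file uses plain ASCII hyphens and one filter per line.

```python
def htvs_protocol(n1, thr1, n2, thr2, n_max=50, write_thr=-10):
    """Return the text of an HTVS protocol file following the template.

    Run counts are written as N-1, as the template requires.
    """
    lines = [
        "3",  # number of Running Filters
        f"if - {thr1} SCORE.INTER 1.0 if - SCORE.NRUNS {n1 - 1} 0.0 -1.0,",
        f"if - {thr2} SCORE.INTER 1.0 if - SCORE.NRUNS {n2 - 1} 0.0 -1.0,",
        f"if - SCORE.NRUNS {n_max - 1} 0.0 -1.0,",
        "1",  # number of Writing Filters
        f"- SCORE.INTER {write_thr},",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Parameters from the first starred line of the rbhtfinder output.
    print(htvs_protocol(7, -12, 25, -17), end="")
```

The returned string can be written to a file and passed to rbdock in place of a fixed number of runs.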
Finally, run rDock replacing the flag -n XX with -t PROTOCOLFILE.txt:
$ rbdock -i INPUT.sd -o OUTPUT -r PRMFILE.prm -p dock.prm -t PROTOCOLFILE.txt