The latest post in our series on promises introduced the async and await commands. That post focused on how these commands further simplify asynchronous programming with promises. This post takes a different angle on their utility: how they can be used to speed up sequential code with minimal effort.
NOTE: The code samples here assume V1.1 of the promise package.
The example I'm using is based on a real utility script I use to check the status of my code repositories, simplified to remove extraneous details. The original sequential version of my script looked as follows.
proc get_status {} {
    global argv vcs_status
    foreach dir $argv {
        set dir [file nativename [file normalize $dir]]
        if {[file exists [file join $dir .hg]]} {
            set vcs_status($dir) [exec cmd /c cd $dir && hg status]
        } elseif {[file exists [file join $dir .git]]} {
            set vcs_status($dir) [exec cmd /c cd $dir && git status]
        } else {
            set vcs_status($dir) "Error: could not recognize VCS for $dir."
        }
    }
}
get_status
foreach {dir status} [array get vcs_status] {
    puts [string repeat = 40]
    puts "STATUS for $dir:"
    puts $status
}
This simple script should be self-explanatory. It logs the status for the list of repositories passed on the command line.
Given that I have more than a dozen repositories, and the actual commands are more involved than the simple status used in the simplified script above, the script took more than just a few seconds to run. Patience not being one of my virtues, I have long wanted to speed up the script but couldn't be bothered to use exec or an open pipeline asynchronously, hooking up the event handlers and so on. Not that it's hugely difficult, but still...
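For comparison, a hand-rolled version of that event-handler approach might look roughly like the sketch below. It uses only core Tcl commands (open, fconfigure, fileevent, vwait). The names run_async, on_readable and wait_all, and the globals results and pending, are hypothetical and chosen for illustration; they are not taken from my original script.

```tcl
# Hypothetical helper: start a command line in the background and
# collect its output in the global array "results" under the given key.
proc run_async {key cmdline} {
    global results pending
    set results($key) ""
    set chan [open |[list {*}$cmdline] r]
    fconfigure $chan -blocking 0
    incr pending
    fileevent $chan readable [list on_readable $chan $key]
}

# Called by the event loop whenever the pipeline has output or has closed.
proc on_readable {chan key} {
    global results pending
    append results($key) [read $chan]
    if {[eof $chan]} {
        # Switch back to blocking mode so that close reports child errors.
        fconfigure $chan -blocking 1
        if {[catch {close $chan} msg]} {
            append results($key) "Error: $msg"
        }
        incr pending -1
    }
}

# Run the event loop until every started command has finished.
proc wait_all {} {
    global pending
    while {[info exists pending] && $pending > 0} {
        vwait pending
    }
}
```

Each repository would then be started with something like run_async $dir [list cmd /c cd $dir && hg status], followed by a single wait_all before printing the results. It works, but the control flow is now spread across callbacks and global counters, and the script no longer reads like the sequential original.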
However, the async / await commands from the promise package made speeding up the above sequential code almost trivial. The slowness of the script above stems from two factors. First, the sequential script does not utilize multiple processors. Second, even on a single-processor system, time is wasted while child processes wait for I/O. Both factors can be addressed very simply by using async / await in combination with pexec, the promise-based equivalent of exec.
Here is the modified script.
package require promise
namespace path promise
async get_status {} {
    global argv vcs_status
    while {[llength $argv]} {
        set argv [lassign $argv dir]
        set dir [file nativename [file normalize $dir]]
        if {[file exists [file join $dir .hg]]} {
            set vcs_status($dir) [await [pexec cmd /c cd $dir && hg status]]
        } elseif {[file exists [file join $dir .git]]} {
            set vcs_status($dir) [await [pexec cmd /c cd $dir && git status]]
        } else {
            set vcs_status($dir) "Error: could not recognize VCS for $dir."
        }
    }
}
eventloop [all* [get_status] [get_status] [get_status] [get_status]]
foreach {dir status} [array get vcs_status] {
    puts [string repeat = 40]
    puts "STATUS for $dir:"
    puts $status
}
The changes we have made are:
- Obviously, we first need to load the promise package itself.
- We then define the get_status procedure using the async command rather than proc.
- The foreach loop is replaced with a while loop, since there are now effectively multiple parallel loops picking elements off the argv list.
- The exec call is replaced by the await and pexec combination.
- Finally, we add the eventloop line to start four asynchronous routines, each of which executes the equivalent of our original code. (I picked 4 because that's the number of processors on my machine.)
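To illustrate why several loops can safely share one work list, here is a minimal sketch of the same while / lassign pattern. It deliberately uses plain Tcl 8.6 coroutines rather than the promise package, with a short timer standing in for await [pexec ...]. The names worker, queue, log and active are hypothetical. Because the routines run cooperatively in a single interpreter and only switch at a suspension point, removing an element from the list is never interrupted, so each item is handled exactly once.

```tcl
# Hypothetical worker: repeatedly takes the next item off the shared
# global list "queue" until the list is empty.
proc worker {name} {
    global queue log active
    while {[llength $queue]} {
        # Pop the first item; no other worker can run between these steps.
        set queue [lassign $queue item]
        # Stand-in for "await [pexec ...]": suspend until a timer fires.
        after 10 [info coroutine]
        yield
        lappend log [list $name $item]
    }
    incr active -1
}

set queue {a b c d e}
set log {}
set active 2

# Start two workers; each runs until its first yield.
coroutine w1 worker w1
coroutine w2 worker w2

# Service the event loop until both workers have finished.
while {$active > 0} {
    vwait active
}
puts $log
```

A foreach loop would not work here, since each routine would iterate over its own copy of the full list and every repository would be processed by every routine.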
NOTE: The eventloop command is new in V1.1 of the promise package. It enters the Tcl event loop and waits for a promise to be settled. In older versions of the package, the equivalent would be something along the lines of
set gate [all* [get_status] [get_status] [get_status] [get_status]]
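# The reaction is called with the promise value appended, so accept and ignore it
$gate done [list apply {{args} {set ::completed true}}]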
vwait completed
Though there are several modifications, they are mechanical and close to trivial. Perhaps the most important characteristic of this transformation is that the parallelized code closely resembles the structure of the original sequential version and is just as easy to follow. It also performs significantly better, executing roughly three times as fast in my case.
This post was actually motivated by a user's query regarding a test framework. The following Tcl pseudocode describes the general structure of scripts that emulate a client running against the server under test.
proc test_client {server iterations} {
    set conn [connect_to_server $server]
    for {set i 0} {$i < $iterations} {incr i} {
        check_response [query $conn QUERY1]
        after [random_think_time] ;# Simulate thinking time
        check_response [query $conn QUERY2]
        after [random_think_time]
        ...
        ...
    }
    close_connection $conn
}
For load testing that emulates multiple clients, the test harness runs the script in multiple threads or processes. This limits how many clients can be emulated on a system, and the question was whether promises could be used to improve this by running multiple clients within each Tcl interpreter without significant changes to the code structure.
Again, the async / await idiom is ideally suited for this, and the equivalent code is shown below.
async test_client {server iterations} {
    set conn [connect_to_server $server]
    for {set i 0} {$i < $iterations} {incr i} {
        check_response [query $conn QUERY1]
        await [ptimer [random_think_time]] ;# Simulate thinking time
        check_response [query $conn QUERY2]
        await [ptimer [random_think_time]] ;# Simulate thinking time
        ...
        ...
    }
    close_connection $conn
}
Basically, the blocking after command is replaced by await on timers created with ptimer. This allows multiple test_client invocations to run concurrently. A single Tcl interpreter can then potentially emulate hundreds of clients, greatly increasing test scalability, with the following simple snippet.
for {set i 0} {$i < 100} {incr i} {
    lappend clients [test_client $server 10]
}
eventloop [all $clients]
Hopefully this short post provides motivation for you to explore promises in more depth.