Last updated: [98/06/18 Marick]
Brian Marick wrote this document. Bill Frantz (frantz-at-pwpconsult.com) is responsible for the code.
Level of Effort
Overall, this is probably one of the safest subsystems to leave lightly tested. Note: not all of the datacomm layer is designed. (See Issues.) How the new pieces will be tested is TBD.
Who Does What?
Brian Marick will suggest tests to Bill Frantz, who will implement them. Bill will also think of new tests on his own.
Estimated effort: 1-2 full days of Brian's time. Bill's time is already included in his schedule for the task.
Repeatability and Automation
I suggest that only two tests (performance and load) be automated. They will cover normal operation and be easy to automate. After future changes, other tests may have to be rerun by hand; they aren't worth automating.
What Is To Be Tested (Overview)
I will flesh out some of these categories later. Some of them may not need fleshing out. Some of them may have already been adequately done.
Highest priority categories are in red.
A straightforward exhaustive search of the possible interleavings of simultaneous startup should be done. This can be implemented manually in the debugger. These tests will not be saved.
For variety, try some of the cases with new connections and some with connections that were suspended.
This is not high priority only because it has already been tested to some extent.
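As an aid to stepping through the cases in the debugger, the interleavings can be listed mechanically. The following is a minimal sketch only: the step names ("A:connect" and so on) are hypothetical placeholders, not the actual startup steps of the datacomm code, and the real steps would have to be substituted by whoever knows the startup sequence.

```java
import java.util.ArrayList;
import java.util.List;

/** Lists every interleaving of two ordered step sequences (step names are hypothetical). */
final class StartupInterleavings {

    /** Returns all orderings that preserve the internal order of a and of b. */
    static List<List<String>> interleavings(List<String> a, List<String> b) {
        List<List<String>> out = new ArrayList<>();
        build(a, 0, b, 0, new ArrayList<>(), out);
        return out;
    }

    private static void build(List<String> a, int i, List<String> b, int j,
                              List<String> prefix, List<List<String>> out) {
        if (i == a.size() && j == b.size()) {
            out.add(new ArrayList<>(prefix)); // one complete interleaving
            return;
        }
        if (i < a.size()) {                   // next step comes from side A
            prefix.add(a.get(i));
            build(a, i + 1, b, j, prefix, out);
            prefix.remove(prefix.size() - 1);
        }
        if (j < b.size()) {                   // next step comes from side B
            prefix.add(b.get(j));
            build(a, i, b, j + 1, prefix, out);
            prefix.remove(prefix.size() - 1);
        }
    }

    public static void main(String[] args) {
        // Placeholder steps; replace with the real startup steps of each side.
        List<String> sideA = List.of("A:connect", "A:sendHello");
        List<String> sideB = List.of("B:connect", "B:sendHello");
        List<List<String>> all = interleavings(sideA, sideB);
        for (List<String> order : all) {
            System.out.println(order);
        }
        System.out.println(all.size() + " interleavings"); // 6 for two steps per side
    }
}
```

The count grows quickly (it is the binomial coefficient of the combined length over the length of one side), so the steps should be kept coarse enough that the list stays practical to walk through by hand.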
The registrar-handling code allows the comm system to query several locations when making a connection, and allows those locations to redirect the search to other locations. It should be tested if the code has changed. Testing can be deferred until there's a functioning PLS; defer if it saves trouble. These tests do not need to be preserved.
Try both clean and dirty TCP shutdowns (kill process vs. kill machine).
Try with messages queued in incoming queue (shutdown notice queued after those messages). (Trace behavior in debugger.)
Try sending a new message after shutdown (should get exception).
This is related to API state testing (described below): errors should be made to arrive while the datacomm subsystem is in various states.
Shutdown with messages queued going out (via both variants of sendMsg).
Explore simultaneous shutdown races as with startup.
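Two of the shutdown checks above (shutdown notice queued behind pending incoming messages, and an exception on send after shutdown) can be written down as small assertions. This is a sketch under stated assumptions: FakeConnection is a hypothetical stand-in, not the real datacomm class; only the name sendMsg comes from this plan, and the exception type is assumed, since the plan says only "should get exception".

```java
import java.util.ArrayDeque;
import java.util.Queue;

/** Sketch of two shutdown checks against a hypothetical stand-in connection. */
final class ShutdownChecksSketch {
    static final String SHUTDOWN_NOTICE = "<shutdown>"; // hypothetical marker

    /** Hypothetical stand-in; the real connection class will differ. */
    static final class FakeConnection {
        private final Queue<String> incoming = new ArrayDeque<>();
        private boolean shutDown = false;

        void deliver(String msg) { incoming.add(msg); }

        void shutdown() {
            shutDown = true;
            incoming.add(SHUTDOWN_NOTICE); // notice goes behind queued messages
        }

        void sendMsg(String msg) {
            // Assumed exception type; the plan does not name one.
            if (shutDown) throw new IllegalStateException("connection is shut down");
            // A real implementation would transmit msg here.
        }

        String nextIncoming() { return incoming.poll(); }
    }

    /** Sending after shutdown should raise an exception. */
    static boolean sendAfterShutdownThrows() {
        FakeConnection c = new FakeConnection();
        c.shutdown();
        try {
            c.sendMsg("late");
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    /** Messages queued before shutdown should be seen before the notice. */
    static boolean noticeArrivesAfterQueuedMessages() {
        FakeConnection c = new FakeConnection();
        c.deliver("m1");
        c.deliver("m2");
        c.shutdown();
        return "m1".equals(c.nextIncoming())
            && "m2".equals(c.nextIncoming())
            && SHUTDOWN_NOTICE.equals(c.nextIncoming());
    }

    public static void main(String[] args) {
        System.out.println("send after shutdown throws: " + sendAfterShutdownThrows());
        System.out.println("notice after queued messages: " + noticeArrivesAfterQueuedMessages());
    }
}
```

Against the real subsystem, the same two checks would be driven through the actual connection objects, with the queue contents observed in the debugger as described above.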
A single test with N partners. Each partner will open connections, send and receive messages, and shut them down. Messages should be checked for non-corruption. In at least some of the cases, connections should be suspended. There is enough randomness to exercise thread safety. The test should be cyclic and by default run for a large number of cycles (it can be run for fewer cycles as a smoke test).
This test should be fully automated.
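The overall shape of such a test might look like the sketch below. It is an illustration only: in-memory queues stand in for real connections, the class and method names are hypothetical, and suspension is not modeled. Each partner runs on its own thread, sends a checksummed message to a randomly chosen partner every cycle, and verifies whatever has arrived in its own inbox.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.zip.CRC32;

/** Cyclic N-partner load test sketch; queues stand in for real connections. */
final class LoadTestSketch {

    /** A payload plus the checksum computed when it was sent. */
    static final class Message {
        final byte[] payload;
        final long checksum;
        Message(byte[] payload) {
            this.payload = payload;
            this.checksum = crc(payload);
        }
    }

    static long crc(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data);
        return c.getValue();
    }

    /** Verifies and counts everything currently waiting in one inbox. */
    static void drain(BlockingQueue<Message> inbox,
                      AtomicInteger received, AtomicInteger corrupted) {
        Message m;
        while ((m = inbox.poll()) != null) {
            received.incrementAndGet();
            if (crc(m.payload) != m.checksum) corrupted.incrementAndGet();
        }
    }

    /** Runs the test; returns {messages received, messages corrupted}. */
    static int[] run(int partners, int cycles, long seed) {
        List<BlockingQueue<Message>> inboxes = new ArrayList<>();
        for (int i = 0; i < partners; i++) inboxes.add(new LinkedBlockingQueue<>());
        AtomicInteger received = new AtomicInteger();
        AtomicInteger corrupted = new AtomicInteger();

        List<Thread> threads = new ArrayList<>();
        for (int p = 0; p < partners; p++) {
            final int self = p;
            Thread t = new Thread(() -> {
                Random rnd = new Random(seed + self); // per-partner randomness
                for (int c = 0; c < cycles; c++) {
                    int target = rnd.nextInt(partners);
                    byte[] payload = ("from " + self + " cycle " + c)
                            .getBytes(StandardCharsets.UTF_8);
                    inboxes.get(target).add(new Message(payload));
                    drain(inboxes.get(self), received, corrupted);
                }
            });
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        // Pick up anything that arrived after a partner finished its last cycle.
        for (BlockingQueue<Message> q : inboxes) drain(q, received, corrupted);
        return new int[] { received.get(), corrupted.get() };
    }

    public static void main(String[] args) {
        // A small cycle count serves as a smoke test; raise it for the full run.
        int cycles = args.length > 0 ? Integer.parseInt(args[0]) : 1000;
        int[] result = run(4, cycles, 42L);
        System.out.println("received=" + result[0] + " corrupted=" + result[1]);
    }
}
```

Taking the cycle count as a parameter keeps the smoke-test and full-length variants in one program. A real version would replace the queues with actual connections and add open, suspend, and shutdown steps to each cycle.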
Suspension of connections
Covered adequately by load testing and API state testing.
Covered by load testing.
For each method, I'll derive test cases, looking especially for odd or easily overlooked cases. Bill can inspect the list and try those he deems important.
I will especially concentrate on deriving a model of the internal state of the datacomm system (e.g., "connection suspended") and explore what happens when certain methods are called in certain states.
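One way to keep the state exploration systematic is to cross every modeled state with every method and review the resulting list, striking the pairs that are uninteresting. The sketch below assumes a made-up model: only the "suspended" state and sendMsg are taken from this plan, and the other state and call names are hypothetical placeholders until the real model is derived.

```java
import java.util.ArrayList;
import java.util.List;

/** Generates state-by-method test case candidates from a hypothetical model. */
final class StateMethodMatrix {

    // Hypothetical states; only SUSPENDED is named in the plan.
    enum State { NEW, OPEN, SUSPENDED, SHUT_DOWN }

    // Hypothetical calls; only SEND_MSG corresponds to a method named in the plan.
    enum Call { SEND_MSG, SUSPEND, RESUME, SHUTDOWN }

    /** Returns one candidate test case description per (state, call) pair. */
    static List<String> cases() {
        List<String> out = new ArrayList<>();
        for (State s : State.values()) {
            for (Call c : Call.values()) {
                out.add("in state " + s + ", call " + c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String testCase : cases()) {
            System.out.println(testCase);
        }
    }
}
```

With four states and four calls this yields 16 candidates, which is small enough to inspect in full and pick the important ones from.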
Check by inspection. This is old code, except for code that solves the "man in the middle" problem. The common case (no spoofing) is tested in normal use. The code to detect spoofing should be easy to check by inspection, so it is probably not worth writing the test support.
Here are some performance test ideas:
During test 2, OptimizeIt should be used to size this subsystem. (Walendo is the guide here.) We'll track those sizes, as well as performance.
Test Support Needed
TBD. Should be only some minor driver code surrounding the subsystem, sufficient to drive these tests.
Some things are not yet designed:
There's no test plan for them yet.
There are still some open issues called out in NewECommSystem.html. As they're resolved, they'll need to be checked.
Unless stated otherwise, all text on this page which is either unattributed or by Mark S. Miller is hereby placed in the public domain.