Update transaction related topics
iondev33 committed Apr 21, 2024
1 parent 0d4da52 commit 71d3fcf
Showing 2 changed files with 42 additions and 1 deletion.
25 changes: 25 additions & 0 deletions gh-pages/docs/ION-Coding-Guide.md
@@ -316,3 +316,28 @@ Next come declarations of various sorts, followed by the ends of the C++ and “
```

Nothing should go after the "#endif" of the include guard.

## SDR Transaction

All writing to an SDR heap must occur during a transaction that was initiated by the task issuing the write. Transactions are single-threaded; if task B wants to start a transaction while a transaction begun by task A is still in progress, it must wait until A's transaction is either ended or cancelled.

A transaction is begun by calling `sdr_begin_xn()`, and the current transaction is normally ended by calling `sdr_end_xn()`, which returns an error code if any serious SDR-related processing error was encountered in the course of the transaction. Transactions may safely be nested, provided that every level of transaction activity that is begun is properly ended.
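
The nesting rule can be illustrated with a small toy model. This is only a sketch: `ToySdr`, `toy_begin_xn()`, `toy_end_xn()`, `toy_update_counter()`, and `toy_demo()` are hypothetical names invented for this example, and the depth counter is an assumption about how nesting can be tracked, not ION's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SDR handle; NOT ION's real Sdr type. */
typedef struct {
    int depth;      /* current transaction nesting depth */
    int committed;  /* number of outermost transactions ended so far */
} ToySdr;

/* Models sdr_begin_xn(): every call enters one more nesting level. */
static void toy_begin_xn(ToySdr *sdr)
{
    assert(sdr != NULL);
    sdr->depth++;
}

/* Models sdr_end_xn(): leaves one level and commits only when the
 * outermost level is ended. Returns 0 on success, or -1 if no
 * transaction is in progress. */
static int toy_end_xn(ToySdr *sdr)
{
    assert(sdr != NULL);
    if (sdr->depth == 0) {
        return -1;
    }
    sdr->depth--;
    if (sdr->depth == 0) {
        sdr->committed++;
    }
    return 0;
}

/* A helper that opens its own transaction, so it nests whenever the
 * caller already has one in progress. */
static int toy_update_counter(ToySdr *sdr, int *counter)
{
    toy_begin_xn(sdr);
    (*counter)++;
    return toy_end_xn(sdr);
}

/* One outer transaction containing two nested updates.
 * Returns the final counter value, or -1 on error. */
static int toy_demo(ToySdr *sdr)
{
    int counter = 0;

    toy_begin_xn(sdr);
    if (toy_update_counter(sdr, &counter) < 0) return -1;
    if (toy_update_counter(sdr, &counter) < 0) return -1;
    if (toy_end_xn(sdr) < 0) return -1;
    return counter;
}
```

In this model, every begin is matched by exactly one end, so a helper can be called either on its own or from inside a caller's transaction without knowing which case applies.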

Another way to terminate a transaction is to call `sdr_exit_xn()`, which implements a critical section within which SDR data is read but not modified. Ending a transaction with `sdr_exit_xn()` asserts that no SDR modification should have occurred during the critical section. The function therefore checks whether any SDR modifications were made during the transaction; if modifications were made and the current transaction is the outermost layer (depth = 1), the result is an unrecoverable SDR error.

Given that `sdr_end_xn()` is designed to handle transactions with SDR modifications while `sdr_exit_xn()` is not, why would one use `sdr_exit_xn()` to end a transaction? It is a way to detect SDR changes that are not supposed to occur but do: it catches both logic errors and implementation errors in the code, and serves as a check that the SDR is behaving as expected. If `sdr_exit_xn()` detects an SDR modification at the outermost layer of a transaction, an error message is logged in the `ion.log` file to document a potential issue with the implementation logic. This is useful for debugging and for improving the overall SDR transaction management software.
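
The read-only check described above can be sketched with a toy model. All names here (`CheckedSdr`, `checked_begin_xn()`, `checked_write()`, `checked_exit_xn()`) are hypothetical, and the `modified` and `faulted` flags are assumptions made for illustration; ION's real bookkeeping and error handling may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an SDR handle; NOT ION's real Sdr type. */
typedef struct {
    int  depth;     /* transaction nesting depth */
    bool modified;  /* set by any write since the outermost begin */
    bool faulted;   /* models the unrecoverable SDR error condition */
} CheckedSdr;

/* Models sdr_begin_xn(): a new outermost transaction starts clean. */
static void checked_begin_xn(CheckedSdr *sdr)
{
    assert(sdr != NULL);
    if (sdr->depth == 0) {
        sdr->modified = false;
    }
    sdr->depth++;
}

/* Models any write to the heap; writes are only legal in a transaction. */
static void checked_write(CheckedSdr *sdr)
{
    assert(sdr != NULL && sdr->depth > 0);
    sdr->modified = true;
}

/* Models sdr_exit_xn(): flags an error only when the outermost layer
 * (depth == 1) is exited after a modification was recorded. */
static void checked_exit_xn(CheckedSdr *sdr)
{
    assert(sdr != NULL);
    if (sdr->depth == 0) {
        return;  /* no transaction in progress */
    }
    if (sdr->depth == 1 && sdr->modified) {
        sdr->faulted = true;  /* the real library would also log this */
    }
    sdr->depth--;
}
```

In this sketch, a nested exit after a write does not raise the error by itself, because the enclosing transaction may still be closed properly; only an exit at depth 1 after a write is treated as a violation.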

The choice among `sdr_end_xn()`, `sdr_exit_xn()`, and `sdr_cancel_xn()` depends on the intent and the outcome of the transaction:

* Situation A: If a fault occurred that requires reverting SDR modifications (including those made by all nested layers) leading up to the fault event, then one should call `sdr_cancel_xn()` to trigger SDR reversibility, if configured, and allow ION to reload the volatile protocol state to restore the SDR heap and working memory state.

* Situation B: If a fault occurred but the implementation has the proper procedure and logic to handle it, one can still call `sdr_end_xn()` because the failure was handled nominally. This applies when:

    * The fault did not result in any SDR heap changes, or
    * The modifications that occurred before the fault and as part of fault handling afterward do not require reversal. For example, the SDR modifications made before the fault remain valid despite the fault, and the SDR modifications made after the fault, intentionally as part of the fault handling procedure (such as updating bundle error counters), are successful, __AND__
    * In either case, the integrity of the protocol state is not affected or can be restored. For example, an invalid bundle was detected and can be safely removed from ION by making appropriate updates to the heap and the working memory.

* Situation C: All transactions and modifications are nominal. In this case, call `sdr_end_xn()`.

The implementation must be able to discern between situations A and B. When one cannot be certain that the SDR will be in an operational state, because of a complex transaction failure case or because of system errors, one should default to A: cancel the transaction and rely on reversibility to avoid leaving the SDR in an inconsistent state.
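
The decision among situations A, B, and C can be written down as a small helper. This is a sketch under stated assumptions: `XnOutcome`, `XnAction`, and `choose_xn_action()` are hypothetical names, and the four boolean fields are one possible way for an application to summarize what it knows at the end of a transaction. None of them are part of the SDR API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type: which call should close the transaction. */
typedef enum {
    XN_END,     /* call sdr_end_xn()    */
    XN_CANCEL   /* call sdr_cancel_xn() */
} XnAction;

/* Hypothetical summary of what the caller knows about the transaction. */
typedef struct {
    bool fault_occurred;         /* any fault during the transaction */
    bool changes_need_reversal;  /* heap changes are no longer valid */
    bool state_integrity_ok;     /* protocol state intact or restorable */
    bool outcome_certain;        /* caller is sure of the fields above */
} XnOutcome;

static XnAction choose_xn_action(const XnOutcome *o)
{
    assert(o != NULL);

    if (!o->fault_occurred) {
        return XN_END;     /* Situation C: everything nominal */
    }
    if (!o->outcome_certain) {
        return XN_CANCEL;  /* cannot tell A from B: default to A */
    }
    if (o->changes_need_reversal || !o->state_integrity_ok) {
        return XN_CANCEL;  /* Situation A: roll back */
    }
    return XN_END;         /* Situation B: fault handled nominally */
}
```

Placing the uncertainty check before the A/B test mirrors the guidance above: when in doubt, cancel.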
18 changes: 17 additions & 1 deletion gh-pages/docs/Known-Issues.md
@@ -27,9 +27,25 @@ Here is a list of known issues that will updated on a regular basis to captures

* When developing and testing ION in a docker container with root permission while mounting ION code that resides in a user's directory on the host machine, file ownership may switch from the user to `root`. This sometimes leads to build and test errors when one switches back to the host's development and testing environment. Therefore, we recommend that you execute the `make clean` and `git stash` commands to remove all build and testing artifacts from ION's source directory before exiting the container.

## SDR transaction ##

### SDR transaction reversal ###

When an SDR transaction is cancelled due to an anomaly, ION will automatically attempt the following:

1. Reverse the transaction, if reversibility is configured, to revert modifications to the SDR's heap space, which contains both user and protocol data units. This action rolls back the series of operations that the cancelled transaction performed on the SDR's data.
2. Once the SDR's heap space has been restored, the "volatile" state of the protocols must be restored because it might have been modified by the transaction as well. This is performed by the `ionrestart` utility.
3. After the volatile state is reloaded, the third step, restoring ION operation, must be triggered by the user. During the anomalous event that caused the transaction cancellation, some of ION's daemons may have stopped. They can be restarted by issuing the start ('s') command through `ionadmin` and `bpadmin`.

### 'Init' Process PID 1 ###

The reloading of the volatile state and restarting of daemons is necessary to ensure the ION system is in a consistent state before resuming normal operations.

During the reloading of the volatile state, the bundle protocol schemes, inducts, and outducts are stopped by terminating the associated daemons. The restart process waits for the daemons to terminate before restarting them. When running ION inside a docker container, the `init` process (PID 1) should be properly configured to reap all zombie processes, because the restart process cannot proceed if a terminated daemon remains a zombie. Typically, to ensure a proper `init` process, one should use the `--init` option of the `docker run` command.

## Reporting Issues ##

* ION-related issues can be reported to the public GitHub page for ION-DTN or ion-core.
* ION's SourceForge page is now deprecated but issues reported there will still be monitored.
* ION's SourceForge page is now deprecated and issues reported there will not be monitored.

