Revised 3.5.1

Jonatan Pålsson 2011-05-12 22:11:00 +02:00
parent e099c26937
commit a5ca92a170


@@ -5950,7 +5950,7 @@ name "sub:Supervisor-structure"
 \begin_inset Float figure
 wide false
 sideways false
-status collapsed
+status open
 \begin_layout Plain Layout
 \begin_inset Note Note
@@ -6011,6 +6011,12 @@ end{centering}
 \begin_inset Caption
 \begin_layout Plain Layout
+\begin_inset CommandInset label
+LatexCommand label
+name "fig:The-supervisor-structure"
+\end_inset
 The supervisor structure of GGS
 \end_layout
@@ -6038,6 +6044,8 @@ key "Savor:1997:HSA:851010.856089"
 .
 When a process misbehaves, the supervisor takes some action to restore
 the process to a functional state.
+In the case of the GGS, a process misbehaving most commonly triggers a
+restart of the faulting process.
 \end_layout
@@ -6048,18 +6056,18 @@ There are several approaches to supervisor design in general (when not just
 process(es) it supervises, and let the supervisor make decisions based
 on this state.
 The supervisor has a specification of how the process it supervises should
-function, and this is how it makes decisions.
+function; this is how it makes decisions.
 \end_layout
 \begin_layout Standard
-In Erlang, we have a simple version of supervisors.
-We do not inspect the state of the processes being supervised.
-We do have a specification of how the supervised processes should behave,
-but on a higher level.
+In Erlang, there is a simple version of supervisors.
+The state of the processes being supervised is not inspected.
+There is, however, a specification of how the supervised processes should
+behave, but on a higher level.
 The specification describes things such as how many times in a given time
 interval a child process may crash, which processes need restarting when
-crashes occur, and so forth.
+crashes occur, etc.
 \end_layout
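As a sketch of what such a specification looks like in Erlang/OTP, the supervisor callback below returns the restart limits together with one child specification. The module and child names (example_sup, example_worker) and the concrete limits are illustrative assumptions, not taken from the GGS source.

    -module(example_sup).
    -behaviour(supervisor).
    -export([start_link/0, init/1]).

    %% Start the supervisor and register it locally under the module name.
    start_link() ->
        supervisor:start_link({local, ?MODULE}, ?MODULE, []).

    init([]) ->
        %% one_for_one: only the crashed child is restarted, its siblings
        %% are left untouched.  At most 5 restarts are tolerated within any
        %% 10-second window; beyond that the supervisor itself terminates
        %% and the failure is escalated to its own supervisor.
        SupFlags = {one_for_one, 5, 10},
        ChildSpec = {example_worker,                   % child id
                     {example_worker, start_link, []}, % {M, F, A} used to start it
                     permanent,                        % always restart on exit
                     5000,                             % ms allowed for clean shutdown
                     worker,                           % a worker, not a supervisor
                     [example_worker]},                % callback modules
        {ok, {SupFlags, [ChildSpec]}}.

Exceeding the restart limit makes the supervisor itself give up and terminate, which is the escalation behaviour described in the following paragraphs.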
@@ -6073,8 +6081,9 @@ When the linking of processes in order to monitor exit behavior is coupled
 \end_layout
 \begin_layout Standard
-In the GGS, we have separated the system in to two large supervised parts.
-We try to restart a crashing child separately, if this fails too many
+In the GGS, the system has been separated into two large supervised parts.
+An attempt to restart a crashing child separately is made; if this fails
+too many
 \begin_inset Foot
 status collapsed
@@ -6093,26 +6102,33 @@ too many
 \end_inset
-times, we restart the nearest supervisor of this child.
+times, the nearest supervisor of this child is restarted.
 This ensures separation of the subsystems so that a crash is as isolated
 as possible.
 \end_layout
 \begin_layout Standard
-The graphic above shows our two subsystems, the coordinator subsystem and
-the dispatcher subsystem.
+Figure
+\begin_inset CommandInset ref
+LatexCommand vref
+reference "fig:The-supervisor-structure"
+\end_inset
+shows our two subsystems, the coordinator subsystem and the dispatcher
+subsystem.
 Since these two systems perform very different tasks, they have been separated.
 Each subsystem has one worker process, the coordinator or the dispatcher.
 The worker process keeps a state which should not be lost upon a crash.
 \end_layout
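Such a two-subsystem tree can be expressed by nesting supervisors: a top-level supervisor owns one supervisor per subsystem, and each of those owns its worker. The sketch below shows what the top level could look like; all module and child names (ggs_top_sup, ggs_coordinator_sup, ggs_dispatcher_sup) are hypothetical stand-ins, not the actual GGS modules.

    -module(ggs_top_sup).
    -behaviour(supervisor).
    -export([start_link/0, init/1]).

    start_link() ->
        supervisor:start_link({local, ?MODULE}, ?MODULE, []).

    init([]) ->
        %% one_for_one at the top: if a whole subsystem supervisor gives up
        %% (its own restart limit is exceeded), only that subsystem is
        %% restarted, keeping the crash isolated from the other subsystem.
        {ok, {{one_for_one, 3, 60},
              [{coordinator_sup,
                {ggs_coordinator_sup, start_link, []},
                permanent, infinity, supervisor, [ggs_coordinator_sup]},
               {dispatcher_sup,
                {ggs_dispatcher_sup, start_link, []},
                permanent, infinity, supervisor, [ggs_dispatcher_sup]}]}}.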
 \begin_layout Standard
-We have chosen to let faulty processes crash very easily when they receive
-bad data, or something unexpected happens.
+A choice has been made to let faulty processes crash very easily when they
+receive bad data or when something unexpected happens.
 The alternative to crashing would have been to try to fix this faulty
 data, or to foresee the unexpected events.
-We chose not to do this because it is so simple to monitor and restart
-processes, and so difficult to try and mend broken states.
+This was not chosen, since it is so simple to monitor and restart processes,
+and so difficult to try to mend broken states.
 This approach is widely deployed in the Erlang world, and developers
 are often encouraged to “Let it crash”.
 \end_layout
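In code, “letting it crash” amounts to pattern matching only on the messages a process expects and writing no defensive catch-all clauses. The gen_server sketch below is a hypothetical illustration of this style and is not actual GGS code; the module name and the {move, Player, Position} request are invented for the example.

    -module(example_game_srv).
    -behaviour(gen_server).
    -export([start_link/0, move/2]).
    -export([init/1, handle_call/3, handle_cast/2]).

    start_link() ->
        gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

    move(Player, Position) ->
        gen_server:call(?MODULE, {move, Player, Position}).

    init([]) ->
        {ok, []}.   % the state is a simple {Player, Position} list

    %% Only a well-formed move request is matched; there is no catch-all
    %% clause and no attempt to repair bad input.  A malformed request
    %% makes the process crash with a function_clause error, and its
    %% supervisor restarts it.
    handle_call({move, Player, Position}, _From, State)
      when is_integer(Position), Position >= 0 ->
        {reply, ok, lists:keystore(Player, 1, State, {Player, Position})}.

    %% Casts are not part of this worker's protocol, so any cast is
    %% unexpected and the process is simply allowed to crash here too.
    handle_cast(Msg, _State) ->
        exit({unexpected_cast, Msg}).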
@@ -6120,9 +6136,9 @@ We have chosen to let faulty processes crash very easily when they receive
 \begin_layout Standard
 To prevent any data loss, the good state of the worker processes is stored
 in their respective backup processes.
-When a worker process (re)starts, it asks the backup process for any previous
-state, if there is any that state is loaded in to the worker and it proceeds
-where it left off.
+When a worker process (re)starts, the backup process is queried for any
+previous state; if there is any, that state is loaded into the worker and
+it proceeds where it left off.
 If, on the other hand, no state is available, a special message is delivered
 instead, making the worker create a new state; this is what happens when
 the workers are first created.
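A worker's init callback implementing this behaviour could look roughly like the sketch below. The message protocol between worker and backup ({get_state, Pid}, {backup_state, State}, no_previous_state) and the module names are invented for illustration and are not the actual GGS protocol.

    -module(example_worker).
    -behaviour(gen_server).
    -export([start_link/1]).
    -export([init/1, handle_call/3, handle_cast/2]).

    start_link(BackupPid) ->
        gen_server:start_link(?MODULE, [BackupPid], []).

    %% On (re)start, ask the backup process whether it holds a previously
    %% saved state.  If it does, resume from that state; if the special
    %% no_previous_state message arrives instead, build a fresh state,
    %% which is what happens the very first time the worker is started.
    init([BackupPid]) ->
        BackupPid ! {get_state, self()},
        receive
            {backup_state, SavedState} -> {ok, SavedState};
            no_previous_state          -> {ok, new_state()}
        end.

    new_state() ->
        [].   % placeholder for a freshly created worker state

    handle_call(_Request, _From, State) -> {reply, ok, State}.
    handle_cast(_Msg, State)            -> {noreply, State}.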