%! TEX root = main.tex \input{preamble.tex} \begin{document} \maketitle \thispagestyle{empty} \newpage The following sections will concern the fourth test, \enquote{Supervisor Controller Dropout}. \section{Test Case} \label{sec:test_case} The purpose of this was to observe the system response to one of the failure modes to which we set out to become (more) resilient. As written in the group report, all of the controller units, as well as the supervisor, was activated during the test. Wind production was also active. As such, all units were under observation, however specifically the system daemon was the object under investigation, as this was our way of implementing a system response to failure in the form of loss of a controller. The system daemon should be passive during normal modes of operation, only receiving supervisor heartbeats, and should only kick into action when failure occurs. Our use case, as specified in the group report, concerns the hybrid power plant as a means of frequency regulation with compensation for the fluctuations normally associated with renewable energy production. Therefore, the system should not be disruptive in case of a single controller breaking down; if that were the case, a single operator mistake could be rather costly to the rest of the grid participants. Because of these concerns, the main metrics with which the outcome is measured is the $ RMSE $ from the expected frequency response during a supervisor controller dropout, as well as a more qualitative assessment of the system reaction time to a fault. \section{Test Specification} \label{sec:test_spec} This test begins with the system in a normal operating state. The individual unit controllers, as well as the supervisor controller, are started. The supervisor controller should then be given some amount of time (in this case 15-20 seconds) to calculate a PCC baseline based on the baseline load combined with median PV production. Once this has been established, regular frequency data should be published to the controller. After the supervisor has had a chance to behave somewhat normally - in our run, 10 seconds after baseline establishing, the controller process is interrupted. The unit controllers are not directly investigated during this step, however their reaction to the lack of frequency data and split points should be apparent in the general system response. To stress the system further, the battery controller is also dropped after the supervisor has been ressurrected. The output of the test, from which the $ RMSE $ is calculated, is the deviance in the FCR activation from the FCR setpoint, in the span of the test. \section{Test Results} \label{sec:test_results} \begin{figure}[H] \centering \includegraphics[width=0.8\textwidth]{img/dropout.png} \caption{Results from the dropout test} \label{fig:results} \end{figure} The above \cref{fig:results} shows the results from the test. The two dotted, horizontal, black lines specify the times at which controllers were dropped; first the supervisor, then the battery controller. The dotted red line is when the system daemon notices the absence of the supervisor, and revives it by spawning a new process. From the $ RMSE $ value, it seems that the system responds rather well to the supervisor being killed. Further inspection shows that the system shows signs of instability in the period immediately following the controllers being dropped, as should be expected, since it takes some time for the other components to mark the supervisor as deceased. However, it seems that the system stabilizes after a while with only minor fluctuations which could be from the wind turbine, since that is out of our control. More results can be found in our GitLab repository. In the subfolder \texttt{testing/final\_tests\_and\_results/dropout/console\_logs} the complete logs from the three unit controllers, the supervisor, and the system daemon. It should be noted that when the supervisor is killed, the console output ends; this is since the system daemon, after a set timeout, revives the supervisor by spawning a new subprocess. This subprocess will then be a child of the system daemon, which itself is a child of a specific terminal. Therefore, the console output of the newly spawned supervisor is printed to the terminal containing the system daemon. This is the same for the battery controller. During this test, neither the PV- or load controllers were touched, so these have continuing output all throughout the test. \section{Discussion and Outlook} \label{sec:discussion} To conclude our test of controller dropout, the system seems to handle a controller dropout reasonably well. This failure mode is a last-ditch effort in case of absolute emergency; only in extreme circumstances would it be expected for a controller to drop out spontaneously. It is, however, rather comforting to know that should it happen, the system seems to stabilize after a short while. There is, however, room for improvement. The controllers seem to react violently in the first phases of a supervisor dropout. This may not be able to be mitigated, however it could be an area of interest. As was also apparent from our other tests, the controllers seem to behave in a competing way, sometimes over-shadowing the other unit controllers. By phasing in controller response following e.g. a sigmoid curve, the system response could perhaps be modified to be a more gentle, gradual reaction. It should also be noted that this way of spawning new processes only works by spawning said child process on the machine running the system daemon. This detail could of course be changed by changing the mechanism of spawning a controller, however it is a definite limitation of the current design. Due to time constraints, it was not further developed, as the principle of reviving controllers was more important than that controller running on a specific machine. A next test, which could yield interesting results and conclusions, could be to observe how the system were to react in case of temporary controller dropouts, where the dropped controller came back to life. In this case, the system daemon would probably respond by creating another controller, which would then compete with the old one, however the exact result cannot be recognized without running the test. It could also be interesting to see how the system would act in case of these dropouts happening more frequently, or during periods of frequency volatility. \end{document}