Fault-tolerant Stochastic Distributed Systems

Silvestre, Daniel

Please use this identifier to cite or link to this item: http://hdl.handle.net/11144/3395

Title:	Fault-tolerant Stochastic Distributed Systems
Authors:	Silvestre, Daniel
Advisor:	Silvestre, Carlos Jorge Ferreira; Hespanha, Joao Pedro Cordeiro Pereira Botelho
Keywords:	Fault-tolerant Distributed Systems; Networked Control Systems; Set-valued Observers; Event-triggered Systems; Self-triggered Systems
Issue Date:	20-Dec-2017
Publisher:	Instituto Superior Técnico
Abstract:	This doctoral thesis discusses the design of fault-tolerant distributed systems, with emphasis on the case where the actions of the nodes or their interactions are stochastic. The main objective is to detect and identify faults so as to improve the resilience of distributed systems to crash-type faults, and to detect the presence of malicious nodes seeking to exploit the network. The proposed analysis considers malicious agents and computational solutions to detect faults. Crash-type faults, where the affected component ceases to perform its task, are tackled by introducing stochastic decisions in deterministic distributed algorithms. Prime importance is placed on providing guarantees and rates of convergence for the steady-state solution. The scenarios of a social network (state-dependent example) and consensus (time-dependent example) are addressed, and convergence is proved. The proposed algorithms are capable of dealing with packet drops, delays, medium access competition, and, in particular, nodes failing and/or losing network connectivity. The concept of Set-Valued Observers (SVOs) is used as a tool to detect faults in a worst-case scenario, i.e., when a malicious agent can select the most unfavorable sequence of communications and inject a signal of arbitrary magnitude. For other types of faults, the concept of Stochastic Set-Valued Observers (SSVOs) is introduced; these produce a confidence set to which the state is known to belong with at least a pre-specified probability. It is shown how, for a consensus algorithm, the structure of the problem can be exploited to reduce the computational complexity of the solution. The main result allows discarding interactions in the model that do not contribute to the produced estimates. The main drawback of using classical SVOs for fault detection is their computational burden.
By resorting to a left-coprime factorization for Linear Parameter-Varying (LPV) systems, it is shown how to reduce the computational complexity. By appropriately selecting the factorization, it is possible to consider detectable systems (i.e., unobservable systems where the unobservable component is stable). Such a result plays a key role in the domain of Cyber-Physical Systems (CPSs). These techniques are complemented with Event- and Self-triggered sampling strategies that enable fewer sensor updates. Moreover, the same triggering mechanisms can be used to decide when to run the SVO routine or to resort to over-approximations that temporarily trade accuracy for performance while maintaining the convergence characteristics of the set-valued estimates. The result is a less stringent requirement for network resources, which is vital to guarantee the applicability of SVO-based fault detection in the domain of Networked Control Systems (NCSs).
Peer Reviewed:	yes
URI:	http://hdl.handle.net/11144/3395
Appears in Collections:	BUAL - Teses de Doutoramento; DCT - Teses de Doutoramento

Files in This Item:

File	Description	Size	Format
thesis.pdf		3,34 MB	Adobe PDF
