How do you decide whether a system is fault tolerant? I have made it simple: if a system has the characteristics below, we can call it a fault-tolerant system.
Modularity
The hardware and software are constructed of modules of fine granularity. These modules constitute units of failure, diagnosis, service, and repair. Keeping the modules as decoupled as possible reduces the probability that a fault in one module will affect the operation of another.
Fail-Fast Operation
A fail-fast module either works properly or stops. Thus, each module is self-checking and stops upon detecting a failure. Hardware checks (through error-detecting codes) and software consistency tests support fail-fast operation.
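To make the fail-fast idea concrete, here is a minimal sketch in Python. The class `FailFastStore` and the exception `ModuleFailure` are hypothetical names invented for this illustration, and a CRC-32 checksum stands in for whatever error-detecting code a real module would use. The module checks every record it reads and stops for good as soon as a check fails.

```python
import zlib


class ModuleFailure(Exception):
    """Raised when a module detects an internal fault and stops."""


class FailFastStore:
    """Hypothetical storage module that checks itself on every read."""

    def __init__(self):
        self._records = {}    # key -> (payload bytes, CRC-32 of payload)
        self.stopped = False

    def write(self, key, payload):
        self._require_running()
        self._records[key] = (payload, zlib.crc32(payload))

    def read(self, key):
        self._require_running()
        payload, stored_crc = self._records[key]
        if zlib.crc32(payload) != stored_crc:
            # Error-detecting code mismatch: stop instead of returning bad data.
            self.stopped = True
            raise ModuleFailure(f"checksum mismatch for {key!r}")
        return payload

    def _require_running(self):
        # Once stopped, the module refuses all further work.
        if self.stopped:
            raise ModuleFailure("module has stopped after detecting a fault")
```

The point of the design is that a caller never sees corrupted data: it gets either a correct answer or a clear failure that another module can react to.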
Single Failure Tolerance
When a single module (hardware or software) fails, another module immediately takes over. For processors, this means that a second processor is available. For storage modules, it means that the module and the path to it are duplicated.
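The takeover described above can be sketched roughly as follows. `Processor`, `ProcessorPair`, and `ModuleFailure` are hypothetical names for this illustration, and the `failed` flag simply simulates a fail-fast processor that has stopped. This is a sketch of the idea under those assumptions, not how any particular system implements it.

```python
class ModuleFailure(Exception):
    """Raised by a module that has stopped after detecting a fault."""


class Processor:
    """Hypothetical fail-fast processor: it either runs the task or raises."""

    def __init__(self, name):
        self.name = name
        self.failed = False   # set to True to simulate a stopped processor

    def run(self, task):
        if self.failed:
            raise ModuleFailure(f"{self.name} is down")
        return f"{self.name} ran {task}"


class ProcessorPair:
    """Primary processor plus one backup that takes over on failure."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup

    def run(self, task):
        try:
            return self.primary.run(task)
        except ModuleFailure:
            if self.backup is None:
                # A second failure is beyond single failure tolerance.
                raise
            # Takeover: the backup becomes the new primary and retries the task.
            self.primary, self.backup = self.backup, None
            return self.primary.run(task)
```

Note that after one takeover the pair has no backup left, so it tolerates exactly one failure until the failed module is repaired and reconnected.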
On-line Maintenance
Hardware and software modules can be diagnosed, disconnected for repair, and then reconnected, without disrupting the entire system’s operation.
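As a rough illustration of disconnecting and reconnecting a module while the system keeps running, consider the sketch below. `ModulePool` and its methods are hypothetical names made up for this example, and each module is modelled as a plain Python callable. The pool refuses to disconnect its last online module, so requests can always be served.

```python
class ModulePool:
    """Hypothetical pool of interchangeable modules that serve requests."""

    def __init__(self, modules):
        self._online = dict(modules)   # name -> callable handling a request
        self._offline = {}             # modules disconnected for repair

    def online(self):
        return list(self._online)

    def disconnect(self, name):
        if name not in self._online:
            raise KeyError(name)
        if len(self._online) == 1:
            # Removing the last module would disrupt the whole system.
            raise RuntimeError("cannot disconnect the last online module")
        self._offline[name] = self._online.pop(name)

    def reconnect(self, name):
        # Return a repaired module to service.
        self._online[name] = self._offline.pop(name)

    def handle(self, request):
        # Any online module can serve the request; use the first one.
        name = next(iter(self._online))
        return self._online[name](request)
```

A short usage example:

```python
pool = ModulePool({
    "disk0": lambda request: "disk0:" + request,
    "disk1": lambda request: "disk1:" + request,
})
pool.disconnect("disk0")        # take disk0 out for repair
reply = pool.handle("read")     # still served, now by disk1
pool.reconnect("disk0")         # disk0 returns to service
```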