3-7 December 2017
Velmoré Hotel Estate
Africa/Johannesburg timezone

Scientific Modeling of Storage System Reliability

4 Dec 2017, 11:00
25m
Rendezvous (Velmoré Hotel Estate)

96 Main Road (M26) Hennops River Erasmia
Talk Storage and I/O

Speaker

Dr Matthew Curry (Sandia National Laboratories)

Description

When designing large-scale storage systems, failure is a serious concern that demands constant attention. However, system designers have little ability to objectively evaluate the risk of data loss for a given storage system. Instead, they must enumerate possible failure modes, estimate their relative probabilities, identify possible mitigations, and decide whether the expense of each is worthwhile. This process relies on folk wisdom, rules of thumb, and anecdotal experience. For systems that grow larger and more complex year by year, this methodology is too imprecise to guarantee safety while ensuring efficiency. This talk will detail some of the progress in the SIMS^2 project, a collaboration between Sandia National Laboratories, the University of Wisconsin-Madison, and Los Alamos National Laboratory charged with increasing the science and rigor behind evaluating system designs. It will cover some of the pitfalls of current methods of evaluating systems, methods for determining the complex behavior of aggregated components, evaluation of different types of failure modes, and some interesting inflection points for different system designs.

HPC content

Contained in Abstract

Primary author

Dr Matthew Curry (Sandia National Laboratories)
