Flashback 2002: "The Ilities"

Posted: March 4, 2014. Tags: Software Craftsmanship

This is a document I wrote in 2002 while leading a RAID array development team. The ideas captured here are as relevant today as they were then. In the Ruby community today, there is a very strong focus on maintainability.

Reliability      Performance   Usability       Testability
Availability     Scalability   Security        Maintainability
Serviceability                 I18N            Coexistence
Diagnosability                 Accessibility   Interoperability
                               Manageability   Reusability

The ilities are the characteristics we are trying to imbue in the systems we create. Where possible, the standard definition of each characteristic has been listed. More importantly for our purposes, each also has a short translation that tries to capture what the term means in plain language.


Reliability

Definition: The probability that a product (system) performs its intended function for a specified time period when operating under normal (or stated) environmental conditions.

Translation: The system should break as infrequently as possible.


Availability

Definition: The percentage of time that a system (or product) is available to perform its functions correctly. There are many different availability measurements; here is one: Availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and MTTR is the mean time to repair.

Translation: When a system component does break, it should not bring the entire system down, and the system as a whole should recover and continue to operate.
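
As a worked instance of the availability formula in the definition above, here is a minimal sketch in Python. The function name `availability` and the sample figures (10,000 hours between failures, 4 hours to repair) are assumptions chosen purely for illustration, not measurements from any real system.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return availability as a fraction between 0 and 1: MTBF / (MTBF + MTTR)."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


HOURS_PER_YEAR = 24 * 365

# Hypothetical figures, for illustration only.
a = availability(mtbf_hours=10_000, mttr_hours=4)
downtime_hours_per_year = (1 - a) * HOURS_PER_YEAR

print(f"availability: {a:.4%}")
print(f"expected downtime: {downtime_hours_per_year:.1f} hours/year")
```

When MTTR is much smaller than MTBF, expected downtime is roughly proportional to MTTR, which is one reason the translation puts the emphasis on recovering quickly rather than only on failing rarely.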


Serviceability

Definition: The probability that a product (system) returns to its intended function within a specified period of time following a service failure.

Translation: Individual components must be upgradable and downgradable in place, without impacting system uptime or functionality. This translation obviously doesn't match the above definition, but it is more useful for our purposes, as the above definition can simply be captured as part of availability.


Diagnosability

Translation: When the system does break, there should be sufficient captured information and tools to enable someone to figure out what went wrong.
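
One common way to capture that information is to log the failing operation's context together with the traceback at the point of failure. The sketch below uses Python's standard `logging` module. The logger name `raid.rebuild`, the function `rebuild_stripe`, and its parameters are hypothetical and stand in for any component operation.

```python
import logging

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
)
log = logging.getLogger("raid.rebuild")  # hypothetical component name


def rebuild_stripe(disk_id: int, stripe: int, blocks: dict) -> bytes:
    """Return the data for one stripe, logging context if the lookup fails."""
    try:
        return blocks[stripe]
    except KeyError:
        # Record what was attempted and on which component; log.exception
        # also appends the current traceback to the log record.
        log.exception(
            "rebuild failed: disk=%d stripe=%d known_stripes=%d",
            disk_id, stripe, len(blocks),
        )
        raise  # the caller still decides how to recover


# Usage: the failure is both logged with context and propagated.
try:
    rebuild_stripe(disk_id=3, stripe=7, blocks={0: b"\x00"})
except KeyError:
    pass
```

The point of the design is that whoever reads the log later sees which disk and stripe were involved, not just that something failed.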


Performance

Translation: The system should be as fast as possible given the limitations of the underlying technology which it uses and with which it interfaces.


Scalability

Translation: The performance impact of using the system in larger and larger configurations should be as small as possible.


Usability

Translation: It must be user friendly. Every computer system in the world should be at least as easy to use as Microsoft Windows. Preferably, it should be as simple to use as a cellphone.


Security

Translation: The software should take measures, appropriate for the environment in which the system will be operating, to ensure that no one gains unapproved access to the system or its data. For example, systems attached to the internet require more aggressive security measures than those in a data center.


I18N

Translation: The software must be localizable.


Accessibility

Translation: If appropriate, the software must include features to make it usable by people with disabilities.


Manageability

Translation: The system must be manageable, both as a stand-alone entity, and as part of a larger management infrastructure.


Testability

Translation: 'Black box' system testing alone does not cut it these days. The system must be designed for individual component testing, and must contain the hooks and tools that make it possible.
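
One way to provide such hooks is to pass a component's collaborators in from outside, so a test can substitute fakes for the real disk and the real clock. The sketch below illustrates the idea in Python. The class `CacheFlusher` and all of its parameters are hypothetical and not taken from any real product.

```python
from typing import Callable, Dict, List


class CacheFlusher:
    """Hypothetical component: writes out dirty blocks once they are old enough."""

    def __init__(self, write_block: Callable[[int], None],
                 now: Callable[[], float], max_age_s: float = 5.0) -> None:
        # Both collaborators are injected, so a test can supply fakes.
        self._write_block = write_block
        self._now = now
        self._max_age_s = max_age_s
        self._dirty: Dict[int, float] = {}  # block number -> time it became dirty

    def mark_dirty(self, block: int) -> None:
        # Keep the earliest timestamp if the block is already dirty.
        self._dirty.setdefault(block, self._now())

    def flush_due(self) -> int:
        """Write every block that is at least max_age_s old; return how many."""
        current = self._now()
        due = [b for b, t in self._dirty.items()
               if current - t >= self._max_age_s]
        for block in due:
            self._write_block(block)
            del self._dirty[block]
        return len(due)


# Component-level test: no real disk and no real waiting required.
written: List[int] = []
clock = [0.0]
flusher = CacheFlusher(write_block=written.append, now=lambda: clock[0])
flusher.mark_dirty(42)
assert flusher.flush_due() == 0  # block is still too young to flush
clock[0] = 5.0                   # advance the fake clock
assert flusher.flush_due() == 1
assert written == [42]
```

Because the clock and the write path are parameters, the component can be exercised in isolation in milliseconds, which black-box system testing alone cannot offer.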


Maintainability

Translation: You mustn't need Einstein's I.Q., 10 years' experience, or psychic powers to be able to understand, fix, or enhance the system.


Coexistence

Definition: Capability of two or more components to exist or function in the same system or environment without mutual interference.

Translation: (a) Don't assume you are the only piece of software that is going to use a particular resource and grab it all automatically. (b) Be reentrant.


Examples:

  1. Different storage devices, HBAs and OS platforms should be able to co-exist on the same SAN (one of the reasons LUN masking was invented was to counteract Microsoft Windows NT's obnoxious habit of grabbing any disks it sees).
  2. Device drivers must support multiple instances of the same device.
  3. Different volume managers (VxVM, SLVM) should be able to co-exist on the same host.
  4. Different multi-pathing solutions (VxDMP, AP, MPXIO) should be able to co-exist on the same host.
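
The requirement that a driver support multiple instances of the same device, and the advice to be reentrant, usually come down to keeping state per instance rather than in module-level globals. Here is a minimal Python sketch of that idea. `DiskDriver`, its device paths, and its queue-depth limit are hypothetical names used only for illustration.

```python
from typing import List


class DiskDriver:
    """Hypothetical driver: all state lives on the instance, none at module level."""

    def __init__(self, device_path: str, queue_depth: int = 32) -> None:
        self.device_path = device_path
        self.queue_depth = queue_depth
        self._pending: List[str] = []

    def submit(self, request: str) -> bool:
        """Queue a request; refuse rather than exceed this device's own limit."""
        if len(self._pending) >= self.queue_depth:
            return False
        self._pending.append(request)
        return True

    def pending(self) -> int:
        return len(self._pending)


# Two instances of the same driver coexist without interfering.
first = DiskDriver("/dev/hypothetical0", queue_depth=1)
second = DiskDriver("/dev/hypothetical1", queue_depth=1)
assert first.submit("read 0")
assert not first.submit("read 1")  # first is full
assert second.submit("read 0")     # second is unaffected by first's state
```

Had `_pending` been a module-level list, the second device would have inherited the first one's full queue, which is exactly the kind of mutual interference the definition rules out.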


Interoperability

Definition: The ability of a system or product to work with other systems or products without special effort on the part of the customer.

Translation: Ensure compliance with existing standards, specifications and interfaces during product design and architecture, so that newly developed components interoperate with, and are compatible with, existing components and components still to come.


Reusability

Translation: Where possible, system components should be designed so that they are reusable and may be utilized simply in other projects.