by Garfield Dean, EUROCONTROL
Open Source is based on public review of source code. But before code can be written to meet standards, the relevant standards must first exist. To be useful, a Traffic Alert and Collision Avoidance System (TCAS) must interoperate with other units.
H. Lueders: A glossary would be useful. >Hugo>
G. Dean: Most of the things will be explained, even if one of the slides is in Japanese. >Garfield>
This presentation discusses the similarities between the open source process and a technical standardization process.
TCAS is equipment on board aircraft that tells the pilot to climb or descend to avoid collision with other aircraft.
F. Gasperoni: Is it the one that was overridden by a controller in Switzerland and caused a crash? >Franco>
G. Dean: Absolutely! >Garfield>
J-D. Frayssinoux: The TCAS was right! >Jean-Dominique>
F. Gasperoni: Yes, the human was wrong! >Franco>
G. Dean: It depends where you limit the system. Do you include the controller and pilot in the system or not? The smaller technical system was right. The larger social system was not correct. >Garfield>
TCAS works by interrogating the transponders of other aircraft, using a radar-like technique to measure the distance to those aircraft and to obtain their altitude. Based solely on that information, it evaluates collision detection algorithms each second to determine whether the aircraft is in danger of colliding with another. The algorithms used are completely defined by pseudo-code and by an equivalent set of state charts. Correspondence between the state charts and the pseudo-code has been explicitly and carefully verified.
This pseudo-code and these state charts have some similarities with OSS.
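As an illustration only (this is not the RTCA pseudo-code, and the threshold values and function names are invented for the example), the per-cycle threat test loosely resembles the following C sketch, built on the "tau" idea of projected time to closest approach:

    #include <math.h>
    #include <stdbool.h>

    /* Illustrative threat test, loosely modelled on the "tau" concept:
     * projected time to zero range = range / closing speed.
     * The thresholds are invented for this sketch, not RTCA values. */
    #define TAU_THRESHOLD_S   25.0   /* hypothetical alert threshold, seconds */
    #define ALT_THRESHOLD_FT 600.0   /* hypothetical vertical separation, feet */

    bool is_threat(double range_nm, double closing_speed_kt,
                   double own_alt_ft, double intruder_alt_ft)
    {
        if (closing_speed_kt <= 0.0)
            return false;                           /* aircraft are diverging */

        /* nm divided by kt gives hours; convert to seconds */
        double tau_s = range_nm / closing_speed_kt * 3600.0;
        double alt_sep_ft = fabs(own_alt_ft - intruder_alt_ft);

        return tau_s < TAU_THRESHOLD_S && alt_sep_ft < ALT_THRESHOLD_FT;
    }

The real logic is far richer (sense selection, altitude-dependent thresholds, tracking), but the sketch shows why the specification can be written as pseudo-code at all: the inputs and the decision are small and well defined.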
The TCAS concept dates back to the 1950s, but it wasn't until the '70s that a real impetus to the program existed, in the form of a US Congressional mandate to build such a system, following a high-profile aviation disaster. Since TCAS systems were to be placed on all aircraft coming into US airspace, the scope of the project went worldwide.
Therefore, the FAA had to develop international standards from the outset. This required cooperation from people all over the world, and they had to get a group of manufacturers, ideally from all over the world as well, to implement the set of specifications which had been defined.
These specifications, these standards, are vital to the system. If one system says "Climb!" and the other aircraft's system says "Climb!" as well, the collision will not be avoided. Coordination and agreed sets of rules must be established. A high-level set of performance standards was created by the International Civil Aviation Organization (ICAO), basically defining the percentage of collisions the system must be able to resolve, acceptable false alert rates, and interoperability considerations. Displays were not defined, nor were many practical details of implementation.
ICAO is a large organization, and the design required committee approval needing consensus, with delegates from organizations in many Member States. In practice, only a handful of States were really interested, along with a couple of organizations like the pilots' and controllers' unions and EUROCONTROL. Progress was excruciatingly slow, and the committee was not the driver of the process. Momentum for the standards came from the development work in the field, but the committee provided a forum for discussion. It was a check and balance, using input from the underlying validation work.
RTCA (the Radio Technical Commission for Aeronautics) did the detail work, including the display descriptions and architectural design. There is a great deal of explanatory text, but the heart of the specification is a hierarchy of detailed state charts and pseudo-code, which manufacturers use to write their actual code in C or Ada.
A detailed series of tests is used to determine the compliance of the manufacturer's code with the specification.
In the RTCA, the detailed design process was again committee-based, requiring consensus. The RTCA is quite political, very strongly influenced by the FAA. The chairman is under a lot of pressure from different quarters: airlines, pilots, controllers, and manufacturers. In theory, anyone is welcome on the committee, but they must show their competence. There are many different viewpoints: some participants did not even want the system to exist.
Initially, the process was fluid: if changes were good, they would rapidly be incorporated into the spec. Inputs came from many sources. The FAA fielded teams to develop alternate representations (the state charts), European teams validated algorithms with European data, and many teams were involved in different areas. The RTCA decision-making process wasn't very transparent, and some participants had more clout than others.
Once the system was implemented, however, incorporating changes became much more difficult. Making certification changes and sending technicians into the field costs quite a bit: even a single software change, simply downloaded to the devices' firmware, would cost an airline around $5,000 per aircraft.
F. Gasperoni: That's actually quite cheap. >Franco>
G. Dean: If you consider that there are 10,000 aircraft in and around Europe that require this software, the cost of a single small software change can reach fifty million dollars. >Garfield>
How does this development compare to OSS? First of all, pseudo-code is not really code. There are at least 5 different implementations of the same pseudo-code. A corollary of this is that stringent testing standards are required to assure proper function and interoperability.
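To make that last point concrete, a conformance suite for this kind of specification might look like the following sketch (the vector format, the function under test, and its signature are all invented here; the actual RTCA test procedures are far more elaborate):

    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical shape of a conformance test: the specification defines
     * input situations together with the result the pseudo-code must
     * produce, and every manufacturer's implementation is run against the
     * same vectors. */
    typedef struct {
        double range_nm, closing_speed_kt, own_alt_ft, intruder_alt_ft;
        bool   expected_threat;
    } test_vector_t;

    /* The unit under test: one manufacturer's implementation of the
     * threat logic (same invented signature as the earlier sketch). */
    extern bool is_threat(double range_nm, double closing_speed_kt,
                          double own_alt_ft, double intruder_alt_ft);

    int run_conformance(const test_vector_t *v, int n)
    {
        int failures = 0;
        for (int i = 0; i < n; i++) {
            bool got = is_threat(v[i].range_nm, v[i].closing_speed_kt,
                                 v[i].own_alt_ft, v[i].intruder_alt_ft);
            if (got != v[i].expected_threat) {
                printf("vector %d: expected %d, got %d\n",
                       i, (int)v[i].expected_threat, (int)got);
                failures++;
            }
        }
        return failures;    /* zero means all vectors passed */
    }

With at least five independent implementations of the same pseudo-code in existence, shared test vectors of this kind are what tie them together.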
Another difference with many OSS projects was that the individuals participating in this process were getting paid. The various organizations sending delegates had to cover these costs, of course, but the individuals involved received monetary compensation for their efforts. The process was fairly open: those who made the effort to be heard could be heard.
Although TCAS is a safety system, it is not a safety-critical system. The question remains, would an OSS version of a safety-critical system be acceptable?
F. Gasperoni: Is this level "E"? >Franco>
G. Dean: No, it isn't. It is classified as level "B", but it is arguable that it only needs to be level "C". If your TCAS system continually generates false alerts, then it is a hazard. But if it simply falls over and does nothing, you are just back at the safety level you were at before; it has not induced any new risk. If you coupled it directly to the autopilot, it might be something else... >Garfield>
Responsibility for the TCAS units lies with the manufacturers. However, the FAA absolves manufacturers of responsibility for design logic flaws; moreover the FAA cannot be sued. The recent accident in Überlingen may actually test this in court.
Suppose there really was a flaw in the TCAS logic: who would take the responsibility? We know that, in a very small percentage of cases, TCAS will create a collision that was not there previously, in the same way that a very small percentage of people inoculated against a disease will have an adverse reaction and die as a result of the inoculation. The same is true for any form of safety-critical software.
P. Kappertz: Was the fault in the algorithm known before? >Peter>
G. Dean: It's inherent within the system. Very briefly, the argument is something like this: without TCAS, you would have 5 mid-air collisions over a given period of time in a given airspace. With TCAS, you would have 2 collisions left. However, one of those 2 collisions has been created by TCAS. You have still considerably improved the overall safety, but one of the collisions has been induced by the system. >Garfield>
P. Johnson: Presumably, the Überlingen accident falls in that category. >Paul>
G. Dean: Absolutely! You can still argue that there was a weakness in the logic (I would call it a flaw), in that... >Garfield>
In comparison with OSS, you have to pay a fee to access the RTCA documents. While the fee is not more than a few hundred dollars, trivial for an organization, it is significant for individuals. This money is used for funding the RTCA, so it raises the question of how to fund the people responsible for establishing the standard.
Feedback from users' experience has been incorporated into upgrades of the system. Since the initial release, there have been 2 upgrades, and there could be another one.
Individual manufacturers add their own features, much like Red Hat or Novell do with Linux. However, these additions are not fed back to the RTCA; they are the manufacturer's selling points.
Some of the questions distributed to all of the authors are worth considering here:
Finally, we can speculate about where OSS could be successfully introduced in ATM. Perhaps we should start using OSS only in non-safety-critical systems? At least initially, that could prove easier to do. Clear lines of responsibility are required. This does not mean that safety-critical software should not be open to review. Who will sit on and run the steering committee?
Initial suggestions could easily include safety and performance analysis tools, and real-time and fast-time simulators. These types of tools are not safety-critical, and they would be useful and widely required by the community. Why shouldn't we have a world-wide community producing professional simulators? Amateur-built flight simulators have been developed by communities, so why not pitch in and use that work in professional simulators?
Why not have an open source ESCAPE or an open source RAMS? These are our in-house proprietary simulators at EUROCONTROL.
from 29'15" to 36'48" (07'33")
H. Lueders: How do you achieve TCAS interoperability? >Hugo>
G. Dean: Through specific standards. Normally one airplane reacts before the other and communicates its intentions; the first aircraft that sends a message is the one that locks things in. On the very rare occasions where there is a synchronous choice of resolution advisories, the aircraft with the lowest Mode-S address is given priority and forces the other one to reverse its choice (each aircraft in the world has a unique Mode-S address). >Garfield>
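As a sketch of the tie-breaking rule just described (the type names and function here are invented for illustration; this is not the actual coordination protocol):

    #include <stdint.h>

    typedef enum { SENSE_CLIMB, SENSE_DESCEND } ra_sense_t;

    /* Sketch of the tie-break described above: if both aircraft pick the
     * same resolution-advisory sense at the same time, the one with the
     * lower (unique, 24-bit) Mode-S address keeps its choice and the
     * other reverses.  Illustrative only, not the RTCA logic. */
    ra_sense_t resolve_sense(uint32_t own_addr, ra_sense_t own_sense,
                             uint32_t other_addr, ra_sense_t other_sense)
    {
        if (own_sense != other_sense)
            return own_sense;             /* senses already complementary */

        if (own_addr < other_addr)
            return own_sense;             /* lower address has priority */

        /* the higher-address aircraft reverses its choice */
        return (own_sense == SENSE_CLIMB) ? SENSE_DESCEND : SENSE_CLIMB;
    }

Because every Mode-S address is unique, the comparison can never tie, so the rule always yields complementary advisories.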
P. Johnson: So the RTCA process obviously does work generally... >Paul>
G. Dean: That sort of standard setting process is somewhat useful in this context. I'm sure that ATM regulators are going to require standards to be set, but the question is how to make that process more open to the man on the street. >Garfield>
P. Johnson: The IETF has a very practical model. >Paul>
O. Robert: There's a big difference between the IETF and the ICAO or the RTCA. The last two organizations are trying to do things before there is any implementation. The IETF tries to validate implementations, because to have an internet standard, you have to have two different interoperable implementations. It's completely the opposite process. >Ollivier>
G. Dean: In practice, the implementation and the standards went hand in hand. Manufacturers started to develop the kit and put in some provisional algorithms to see how things were working, and in the meantime people were developing alternative algorithms. >Garfield>
R. Schreiner: I think it is more the OMG (Object Management Group) style. >Rudolf>
P. Kappertz: It is not so clear why this approach is called "open-source" software. It looks like a perfect example in which you really can define standards that are applied, work, and are interoperable. >Peter>
G. Dean: The standards are at an exceptionally low level; they really get down to the engineering details. It is interesting to ask "To what extent can you push the standards down to the details?" and "Can you set an open source implementation as a standard?" That is perhaps the challenge. >Garfield>
M. Bourgois: From a social point of view, there is a strong similarity with the process of open source. >Marc>
J. Feller: Open Source projects tend to flourish when there are clear standards. Take for example the W3C, or the Apache Foundation's projects like Tomcat or Jakarta. >Jo>
Another thing I want to say is about your question of whether OSS is appropriate for safety-critical systems. The answer is probably the same as if you asked whether proprietary software is appropriate for safety-critical systems. You have criteria in mind when evaluating proprietary systems, and they are no more or less stringent for OSS.
G. Dean: Fair comment! >Garfield>
(*) In a private communication, in reply to a question of Garfield Dean, Brian Fitzgerald explained: «'Security Symmetry' is a reference to Ross Anderson's conjecture (discussed in Perspectives on Free and Open Source Software, The MIT Press, Cambridge, USA, June 2005) which proposes that open systems may be more prone to security attacks (because 'evil' crackers can see the code) but this is balanced by more opportunity to identify and fix potential security flaws in the first place (because 'good' hackers can also see the code). Breaking the security symmetry would be trying to shift the balance more towards realising the benefits, at the expense of incurring the risks.»