Verifying 50,000 lines of C code
WHAT DOES “FORMAL VERIFICATION” MEAN?
It's possible that before too long, software products will be labelled ‘quality verified’. That's because increasingly complex processor architectures, diverse machine and software configurations and the implications of parallel software call for a new methodology to prove that software is of high quality and matches its specification.
Current practice in software development focuses on ‘testing’. A good analogy is building a house, where ‘testing’ checks that the building looks like the elevation in the blueprint and that high-level functions, such as opening the doors, match the high-level design of the house. Sometimes that's good enough, but increasingly it isn't, because the demands of complex software and of end users keep growing.
However, there's a glimmer of hope, as a methodology called ‘verification’ is being adopted more and more in software development. “With increasingly complex software, we need a formal and automated methodology that allows operating system developers to do the necessary verification,” says Stephan Tobies, a software development engineer who leads the development of verification tools at the European Microsoft Innovation Center in Aachen, Germany. “Bugs get harder to find. By using a mathematical approach, we can prove the absence of certain bugs.” Again, using the analogy of building a house, it means checking that what has been built matches exactly what is in the detailed building plan, including all the plumbing, electrical wiring and reinforcements that are not directly visible.
For software development, this idea means that if there is a precise specification, it is possible to check with logical methods whether the code fulfils it. Every step of the code is broken down into logical expressions, which are checked to confirm that the conditions described in the specification are met.
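To make this concrete, here is a minimal sketch of what a specification attached to C code can look like, written in the contract style used by deductive C verifiers (the ACSL-like annotation syntax shown here is purely illustrative and is not taken from the Hyper-V code base). The tool translates the contract and the function body into logical formulas and proves that the postcondition holds for every input permitted by the preconditions, not just for sampled ones:

    #include <limits.h>

    /*@ requires 0 <= a && 0 <= b;
      @ requires a <= INT_MAX - b;
      @ ensures  \result == a + b;
      @*/
    int checked_add(int a, int b)
    {
        /* The second precondition rules out signed overflow, so the
           verifier can prove the addition is always well defined. */
        return a + b;
    }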
This approach to proving correctness gives far greater assurance than simple tests, which only check examples of the system's behaviour. However, this kind of proof is hard to automate, so computational proof methods are today used primarily for mission-critical systems such as aircraft electronics, and their impact on the overall software industry has so far been limited.
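For contrast, an example-based test of the hypothetical checked_add function from the sketch above might look as follows. It establishes correctness only for the two input pairs it actually runs, and says nothing about any other inputs, including ones that would overflow:

    #include <assert.h>

    int checked_add(int a, int b);   /* from the sketch above */

    int main(void)
    {
        /* Each assertion checks one concrete input pair and nothing more. */
        assert(checked_add(2, 3) == 5);
        assert(checked_add(0, 0) == 0);
        return 0;
    }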
ACADEMIA AND INDUSTRY JOINING FORCES TO WORK ON REAL-WORLD SOFTWARE PRODUCTS
Microsoft and the University of Saarland in Saarbrücken, Germany, have joined forces in a programme supported by the German government to develop the formal verification of software to a new level and to foster the role of this methodology in the software industry. The project VERISOFT and its successor VERISOFT XT aim at creating methods and tools for the automated formal verification of software.
The collaboration fascinates both sides: Microsoft's verification research groups in Redmond, USA, and at the European Microsoft Innovation Center (EMIC) in Aachen, Germany, are working with Wolfgang Paul, professor of computer architecture in Saarbrücken, Germany, who is the scientific project lead for VERISOFT XT.
Professor Paul had already used mathematical theory to verify small operating system kernels of 5,000 lines of code in the VERISOFT project. For their collaboration, the partners chose a real-world product to be verified: the Windows Server 2008 ‘Hyper-V’, software that acts as a super-operating-system able to run different operating systems on one computer [see accompanying box]. With Microsoft on board, the research project was now tasked with verifying ten times that amount of code, about 50,000 lines, something Professor Paul says hasn't been done before – hence his view that this collaboration is a key milestone.
This second phase of the VERISOFT project, called VERISOFT XT, started last year, with both parties, Microsoft and the University team in Saarbrücken, keen to combine their knowledge and tools. Professor Paul remembers jumping at the chance to apply his research on validating complex computer software to a real-life operating system. “It was like a childhood dream, to work with Microsoft Research and EMIC on the Hyper-V with the aim of removing all programming errors,” says Professor Paul.
“Our task at EMIC is to collaborate with academia for mutual benefit. This project is one of the rare cases in which verification technology is being applied to a real product. This is a major step forward for the whole field,” says Thomas Santen, a software development engineer who leads the verification group at EMIC.
THE WAY FORWARD – LOTS TO BE DONE, AND LOTS OF OPPORTUNITIES
The verification of hardware has advanced greatly in recent years, but software verification has to catch up, and will do so rapidly. Professor Paul emphasises the enormous potential of this approach, given that much of the theory in computer science and mathematics has already been developed and is ready to be applied: “This is not technology in its infancy. We are shooting with all that we have; all theories will be applied.” At the same time, today's mathematical models lag decades behind the state of the art of current computers and software. “The textbook theory isn't developed enough to deal with multiple processors running in parallel and complicated memory systems,” Professor Paul says. “We have to close this gap.” Also, the complexity of tackling the hypervisor's code running on high-end parallel processors, such as those from Intel and AMD, creates the need for more efficient tools to handle the sheer amount of code.
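A small illustration (not taken from the project) of why parallelism strains textbook proof methods: even a one-line increment of a shared variable is a non-atomic read-modify-write, so a proof must account for every possible interleaving of the threads rather than a single sequential execution:

    #include <pthread.h>

    static int counter = 0;                 /* shared, unsynchronised */

    static void *worker(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 1000; i++)
            counter++;                      /* racy read-modify-write */
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        /* counter is at most 2000, but increments can be lost when
           the threads interleave, so the final value cannot be read
           off from the source code alone. */
        return 0;
    }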
According to Stephan Tobies, “Being able to verify the Hyper-V is still an order of magnitude away from the verification of a full operating system. The work could be extended in the future to verify certain parts of the Windows operating system, such as its memory manager, which by itself is already more complex than the complete hypervisor.”
“Eventually it may be possible to verify complicated application programs as well,” says Professor Paul, who adds that the Verisoft XT project aims to establish a new software industry.
In the future, we may see ‘correct’ software in the sense that it fulfils its specification. There may still be ‘misbehaviours’, however, in situations where the requirements of the software, or the conditions of its use, its environment or its users, are not adequately described, so that the specification does not match reality. But we are nearly there …
WHAT IS THE HYPER-V?
Hyper-V is a hypervisor, a virtualisation layer that enables multiple guest operating systems to run at once on the same machine. Hypervisors first emerged in the mainframe world, but they are having a renaissance in mainstream computing because advances in hardware capabilities leave many powerful servers idling most of the time – time that can more beneficially be spent serving other tasks for other users.
Running multiple operating systems allows servers to be consolidated and more work to be extracted from a single system, reducing costs through less hardware, energy and management overhead. Hypervisors also help to improve a computer system's security and reliability by isolating their guests from one another: a crash or security breach in one guest does not compromise the others, even though they all share the same hardware.
“Most of the time the server isn't fully booked, so it makes sense to run different tasks on the same server at the same time,” explains EMIC's Thomas Santen. “But you have to make sure the different tasks don't interfere with each other.” When multiple operating systems access the same memory at once, it is the hypervisor's task to ensure that these accesses do not interfere with each other.
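This non-interference property can be pictured as a simple ownership check. The sketch below is purely illustrative – the names and the flat page table are hypothetical, not Hyper-V's actual data structures – but it captures the property to be proved: a guest partition may only touch host memory pages that it owns.

    #include <stdbool.h>
    #include <stdint.h>

    #define NUM_HOST_PAGES 4096

    /* Which guest partition owns each host page (0 = unassigned). */
    static uint32_t page_owner[NUM_HOST_PAGES];

    /* The hypervisor grants a memory access only if the requesting
       partition owns the page, so guests cannot read or corrupt each
       other's memory. */
    bool access_allowed(uint32_t partition, uint64_t host_page)
    {
        return host_page < NUM_HOST_PAGES
            && page_owner[host_page] == partition;
    }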
Hyper-V is in beta testing. It will be sold both as part of Windows Server 2008, which started shipping in early March, and as a stand-alone product starting this summer.