Program synthesis (SURVEY UNDER CONSTRUCTION)

We use the term program synthesis to refer to the development of a program or a digital design from a set of desired properties. These properties can be described in various ways, e.g., by a set of logical formulas or a mathematical model, by another program that is written in a more abstract programming language. Synthesis is the counterpart to program analysis, which tries to extract a set of properties or a model from a given program or digital design, or program verification, which tries to establish a relationship between a set of properties and the given program or circuit. The set of desired properties are usually called the specification of the system and the constructed system is referred as the implementation of the specification.

Techniques related to synthesis:

  • Compilers (e.g., Java to JVM, C to Assembly,…)
  • Code generators for model-based design (e.g., Matlab Simulink, Esterel,..)
  • High-level hardware synthesis (e.g., C to Verilog)
  • Logic synthesis (RTL to gate level, net list to layout,…)
  • Synthesis from logical specifications
  • Model-based design and other domain specific synthesizer:
    • parser generators (YACC)
    • generator for statistical analyzers (AutoBayes Synthesis System, Johann M. Schumann, NASA - Moffett Field)
    • Matlab Simulink from MathWorks (based on a synchronous dataflow language, generates C/C++ code)
    • SCADE Suite from Esterel Technologies (Systems are described using a synchronous language, generates C/C++ code)
    • Bluespec compiler (HDL based on rules, each rule is executed automatically, designs have no explicit clock, generates Verilog code)
    • BIP compiler (Systems are describe using extended automata and synchronization constructs, generates C/C++ code)

In the following we will focus on synthesis from logical specifications.

Program synthesis from logical specification

Motivation. A solution to this problem would allow augmenting the current programming style, which is mostly imperative, with a declarative, logical style. This is particularly beneficial for systems in which concurrency plays an important role.

In this area approaches can be split into three groups:

  1. Logic and constraint programming (Prolog-like programming languages with or without additional constraints)
  2. Logic programming or logic program synthesis (interactive program synthesis based on theorem provers)
  3. Specialized algorithmic program synthesis (algorithmic program synthesis based on decision procedures)

Logic programming, in the narrower sense in which it is more commonly understood, refers to Prolog-like programming languages, which uses logic as both a declarative and procedural representation language. Logic programming is based on the fact that the way in which backward-reasoning theorem-provers handle an implication A1 ∧ … ∧ An → G can be viewed as a program that shows G by showing A1,…, and An first. This way declarative programs are executed following a fixed strategy, and only those solutions computable by this strategy are found. This is in contrast to algorithmic program synthesis techniques that are guaranteed to find all solutions.

Constraint logic programming is a form of constraint programming, in which logic programming is extended to include concepts from constraint satisfaction.

Logic programming, in the broader sense, or sometimes called logic program synthesis refers to the systematic development of programs based on guided theory-proofing techniques. It starts from specification in first or higher order logic and uses a semi-automated way of refinement to derive the program. Refinement steps are initiated by user providing a tactic or a design principle (e.g., divide and conquer principle). Then, the underlying theorem prover automatically perform the refinement step.

Conferences

  • LOPSTR – International Symposium on Logic-Based Program Synthesis and Transformation (BJ personal note: Sumit Gulwani is invited speaker on PPDP 2010)
  • PPDP – International ACM SIGPLAN Symposium Principles and Practice of Declarative Programming

Surveys

  • Pierre Flener, Survey on Logic Program Synthesis, 2002
  • Yves Deville, Logic Programming: Systematic Program Development, 1990

Prominent Systems

  • Specware – an advanced software development system developed at Kestrel Institute. Specware is based on technology developed in Douglas Smith's KIDS system, which aims to mechanize the development of software from formal specifications. KIDS has been used to synthesize many algorithms, including a variety of complex high-performance scheduling applications for the US Air Force.
  • … (add more)

Algorithmic program synthesis refers to a set of automated synthesis techniques that are based on decision procedures. Each of them targets a particular class of specifications and generates a program within a predefined set of programs. For instance, two recently developed decision procedures1) 2) for realizability of Linear-time Temporal Logic (LTL) have led to the development of new tools 3) 4) 5) 6) 7) 8) 9) and algorithms10) 11) 12), that synthesize reactive finite-state programs from LTL specifications. Some of these tools are able to synthesize small but realistic industrial hardware modules from their logical specifications13) 14) 15). Other techniques were designed to simplify specific programming tasks by adding new programming constructs, e.g., synchronization mechanism for concurrent data structures16) 17) deadlock avoidance constructs 18) 19) or allowing implicit computation of values and advanced pattern matching20). Synthesis techniques are also used to correct 21) or optimize22) existing programs and to build programs from a fixed set of library components or program statements 23) 24) 25) 26) 27).

Like in program verification, there are two main directions in algorithmic program synthesis: control-oriented and data-oriented synthesis techniques. Control-oriented techniques (e.g., used to construct digital designs or synchronization in concurrent programs) assume finite-state data structures (or finite-state abstractions) but can generate interactive (non-terminating) programs (called reactive systems). Input specifications describes the desired sequences of behaviors using a temporal logic (e.g., LTL). These techniques are based on automata theory on infinite words and automata-based game theory. Data-oriented techniques assume a bounded length of the program but can cope with infinite-state data structures or abstractions. They start from an input-output relation given by a formula in a realizable logic, e.g., integer arithmetic or propositional logic, and construct source code satisfying the specification. In some cases, the class of generated program is restricted by the set of available statements or an initial incomplete program.

Approaches

Application Input Output Main Techniques Restrictions Contact person
Full program construction LTL FSM (dot) CoBuchi-tree automata, Anti-chains Spec size, finite-state systems J.F. Raskin, ULB
Full program construction LTL-subset (Streett-1) Circuits (Verilog) GR1-games, BDDs Finite-state systems R. Bloem, TU Graz
Full program construction LTL-subset (Streett-1) FSM (?) GR1-games, BDDs Finite-state systems Y. Saar, Weizman Institute
Program Repair LTL-subset (LTL-det) Programs (Verilog) Buchi-games, BDDs Finite-state systems B. Jobstmann, Verimag
Full program construction FOL + tactics Program (?) Theorem proofing requires interaction D. Smith
Synchronization mechanism E. Yahav
S. Gulwani
S. Chandra
V. Kuncak
Program Sketching R. Bodik
Deadlock avoidance mechanism Y. Wang
B. Finkbeiner

Dagstuhl Talks

Program Sketching Rastislav Bodik, UC Berkeley
Satish Chandra, IBM TJ Watson Research Center
Synthesizing Verifiers for Synthesized Code Ewen W. Denney, NASA
CEGAR for Synthesis Bernd Finkbeiner, Universität des Saarlandes
Dimensions in Program Synthesis Sumit Gulwani, Microsoft Research
Component based Synthesis Sumit Gulwani, Microsoft Research
? Shachar Itzhaky, Tel Aviv University
Synthesis of Communicating Automata from MSCs Joost-Pieter Katoen, RWTH Aachen
Efficient Synthesis of Asynchronous Systems Uri Klein, Courant Institute - New York
Synthesis Procedures Viktor Kuncak, EPFL
Synthesis as Learning Automata from Examples Martin Leucker, TU München
Systematic Program Design: From Clarity to Efficiency Yanhong Annie Liu, SUNY - Stony Brook
Synthesizing programs using model checking based genetic programming Doron A. Peled, Bar-Ilan University
Synchronous Reactive Synthesis From Linear Temporal Logic Specifications Nir Piterman, Imperial College London
Synthesizing Hardware from Sketches Andreas Raabe, TU München
Software Synthesis is Hard – and Simple Sven Schewe, University of Liverpool
Data Mining of Air Traffic Track Data for NGATS with the AutoBayes Synthesis System Johann M. Schumann, NASA
Mechanized Algorithm Design Douglas R. Smith, Kestrel Institute
New directions in Computer Supported Programming Armando Solar-Lezama, MIT
Program Synthesis using SMT Solvers Saurabh Srivastava, University of Maryland
Synthesis of Concurrent Algorithms Martin T. Vechev, IBM TJ Watson Research Center
Recursive Plans and Imperative Programs Revisited: Deductive Synthesis Richard Waldinger, SRI
Abstraction-Guided Synthesis of Synchronization Eran Yahav, IBM TJ Watson Research Center
Inferring Synchronization under Limited Observability Greta Yorsh, IBM TJ Watson Research Center
1) Orna Kupferman and Moshe Vardi, Safraless Decision Procedures in FOCS, 2005
2) Nir Piterman, Amir Pnueli and Yaniv Saar, Synthesis of Reactive(1) Designs in VMCAI, 2006
9) Synthesis extension in VIS
10) Orna Kupferman, Nir Piterman, and Moshe Vardi, Safraless compositional synthesis in CAV, 2006
11) Sven Schewe, Bounded Synthesis in Automated Technology for Verification and Analysis, 2007
12) Saqib Sohail, Fabio Somenzi, and Kavita Ravi, A Hybrid Algorithm for LTL Games in VMCAI, 2008
13) Roderick Bloem, Stefan Galler, Barbara Jobstmann, Nir Piterman, Amir Pnueli, and Martin Weiglhofer, Automatic Hardware Synthesis from Specifications: A Case Study in DATE, 2007
14) Roderick Bloem, Stefan Galler, Barbara Jobstmann, Nir Piterman, Amir Pnueli, and Martin Weiglhofer, Specify, Compile, Run: Hardware from PSL in COCV, 2007
15) Yashdeep Godhal, Krishnendu Chatterjee, and Thomas A. Henzinger, Synthesis of AMBA AHB from Formal Specification in CoRR, 2010
16) Martin T. Vechev, Eran Yahav, and Greta Yorsh, Inferring Synchronization under Limited Observability in TACAS, 2009
17) Armando Solar-Lezama, Christopher Grant Jones, and Rastislav Bodik, Sketching concurrent data structures in PLDI, 2008
18) Yin Wang, Terence Kelly, Manjunath Kudlur, Stephane Lafortune, and Scott A. Mahlke, Gadara: Dynamic Deadlock Avoidance for Multithreaded Programs, OSDI, 2008
19) Yin Wang, Stephane Lafortune, Terence Kelly, Manjunath Kudlur, and Scott A. Mahlke, The theory of deadlock avoidance via discrete control in POPL 2009
20) Viktor Kuncak, Mikael Mayer, Ruzica Piskac, and Philippe Suter, Complete Functional Synthesis, PLDI, 2010
21) Barbara Jobstmann, Andreas Griesmayer, and Roderick Bloem, Program Repair as a Game in CAV, 2005
22) Armando Solar-Lezama, Rodric M. Rabbah, Rastislav Bodik, and Kemal Ebcioglu, Programming by sketching for bit-streaming programs in PLDI, 2005
23) Armando Solar-Lezama, Liviu Tancau, Rastislav Bodik, Sanjit A. Seshia and Vijay A. Saraswat, Combinatorial sketching for finite programs, ASPLOS, 2006
24) Armando Solar-Lezama, Gilad Arnold, Liviu Tancau, Rastislav Bodik, Vijay A. Saraswat, and Sanjit A. Seshia, Sketching stencils in PLDI, 2007
25) Armando Solar-Lezama, The Sketching Approach to Program Synthesis in APLAS, 2009
26) Saurabh Srivastava, Sumit Gulwani, and Jeffrey S. Foster, From program verification to program synthesis in POPL, 2010
27) Susmit Jha, Sumit Gulwani, Sanjit A. Seshia, and Ashish Tiwari, Oracle-Guided Component-Based Program Synthesis in ICSE, 2010