Saturday, June 13, 2009

FIT 2009 Overview

This year's PLDI in Dublin, Ireland features a new session called FIT, or Fun Ideas and Thoughts. FIT follows in the spirit of sessions held at other conferences that feature Wild And Crazy Ideas (aka WACI). As FIT organizers, we believe that such venues can nurture budding ideas, and evidence has shown that ideas presented in such venues eventually materialize into high-quality conference publications.

We evaluated 14 submissions in response to the call for Fun Ideas and Thoughts, and selected 7 for presentation. Of the 14 submissions, we received permission to post 10: the 7 selected for presentation, and 3 that were not selected. For each of the posted papers, we have also included a short summary of the reviewer comments. We encourage everyone to comment on the ideas (and the reviewer comments), but please do so respectfully and critically: convince the reader why your point of view is valid.

We hope that this online forum fosters ongoing discussion of fun ideas and thoughts and in turn provides positive feedback to the authors, potentially leading to new research, collaboration, and opportunities.

Your interest and feedback will make FIT a success.

Simple thread semantics require race detection

by Hans Boehm (HP Labs, USA)

Summary of reviewer comments:
The idea is to prohibit data races in programs and to require run-time race detection, in order to simplify language semantics. In other words, one could design a language on the assumption that data-race detection is feasible. Instead of spending years arguing about fine points of memory semantics that no one can understand and few pay attention to, we can attack the underlying problem by prohibiting data races and requiring the language implementation to detect these errors. A great idea that makes a lot of sense; too bad it is still too costly.
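
A minimal sketch of the kind of program such an implementation would have to reject (our illustration, not from the paper): two threads perform an unsynchronized read-modify-write on a shared counter, which is exactly the sort of race whose permitted outcomes memory-model lawyers must otherwise enumerate.

    #include <thread>

    int counter = 0;  // shared and unsynchronized

    void increment() {
        // Unsynchronized read-modify-write: two threads can interleave
        // here, so this is a data race. Under the proposed discipline,
        // the implementation detects the race and halts with a
        // diagnostic instead of exhibiting one of the subtle behaviors
        // a relaxed memory model would otherwise have to permit.
        ++counter;
    }

    int main() {
        std::thread t1(increment), t2(increment);
        t1.join();
        t2.join();
        // Racy outcome: counter may be 1 or 2.
    }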

Mutable state in parallel programs

by Matteo Frigo (Cilk Arts, USA)

Summary of reviewer comments:
The paper argues that reducers (adding multiple-instance values and reductions to a programming language) are one way to handle mutable state in parallel programs effectively and efficiently. The core ideas are arguably not entirely new and were extensively explored in the Ultracomputer and parallelizing-compiler research of the 80's, but it is time to rediscover them. The paper takes these ideas further, particularly in conjunction with fork-join parallelism.
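
As a rough illustration of the reducer idea (a hand-rolled sketch, not the Cilk Arts API): each worker updates a private view of the shared variable, and the views are combined with an associative reduction when the parallel strands join, so the result is race-free and deterministic.

    #include <numeric>
    #include <thread>
    #include <vector>

    // Illustrative sum reducer: one private "view" per worker, so
    // updates need no locks; views are combined with an associative
    // reduction when the strands join.
    struct SumReducer {
        std::vector<long> views;
        explicit SumReducer(int workers) : views(workers, 0) {}
        void add(int worker, long x) { views[worker] += x; }  // race-free
        long reduce() const {
            return std::accumulate(views.begin(), views.end(), 0L);
        }
    };

    int main() {
        SumReducer sum(2);
        std::thread t1([&] { for (long i = 0;   i < 500;  ++i) sum.add(0, i); });
        std::thread t2([&] { for (long i = 500; i < 1000; ++i) sum.add(1, i); });
        t1.join();
        t2.join();
        long total = sum.reduce();  // always 499500, whatever the schedule
        (void)total;
    }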

Can Computers be Programmed Productively in the Post-Dividend Era?

by Rastislav Bodík, Justin Bonnar (University of California, Berkeley, USA) and Doug Kimelman (IBM T.J. Watson Research Center, USA)

Summary of reviewer comments:
The authors argue that the programming abstractions that have led to programmer productivity are not necessarily energy efficient. So in the context of energy-constrained computing devices (e.g., when battery life is short), there must be a better way of implementing the intended computation. They outline the areas of focus that they are pursuing, which can be summarized as (static) specialization, direct refinement from high-level abstractions, and ML-style static languages.
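
A hypothetical example of the specialization direction (our sketch, not the authors'): the convenient abstraction pays for dynamic dispatch on every element, while a statically specialized version of the same computation lets the compiler inline and vectorize, doing the same work for less energy.

    #include <functional>
    #include <vector>

    // Productive but costly: std::function forces an indirect call per
    // element, which the compiler cannot inline away.
    long sum_generic(const std::vector<int>& v, std::function<long(int)> f) {
        long s = 0;
        for (int x : v) s += f(x);
        return s;
    }

    // Statically specialized: the functor type is known at compile
    // time, so the call is inlined and the loop can be vectorized.
    template <typename F>
    long sum_specialized(const std::vector<int>& v, F f) {
        long s = 0;
        for (int x : v) s += f(x);
        return s;
    }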

A block-based bytecode format to simplify and improve Just-in-Time compilation

by Christian Wimmer (University of California at Irvine, USA)

Summary of reviewer comments:
The paper advocates bytecode representations specified at the basic-block, rather than method, granularity. The growing popularity of JIT compilation justifies an effort to design a bytecode language suitable for a spectrum of languages. The paper goes on to list design criteria for a new bytecode format, which may serve the reader as an abridged tutorial on the demands placed on modern bytecode languages. The paper does not yet reflect a broader analysis of existing bytecodes, including the CLR's CIL and LLVM bitcode, and lacks a deeper semantic analysis of their limitations.
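
A toy sketch of what block granularity might mean (our illustration, not the paper's proposed format): the unit of the bytecode is a basic block with explicit control-flow edges, so a JIT compiler receives the control-flow graph directly instead of rediscovering it by scanning method bodies.

    #include <cstdint>
    #include <vector>

    // Hypothetical block-granularity bytecode: each block holds only
    // straight-line code, and control flow between blocks is explicit.
    struct BasicBlock {
        std::vector<uint8_t>  code;        // straight-line bytecodes
        std::vector<uint32_t> successors;  // indices of successor blocks
    };

    struct Function {
        std::vector<BasicBlock> blocks;    // blocks[0] is the entry
    };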

A determinizing compiler

by Nalini Vasudevan and Stephen Edwards (Columbia University, USA)

Summary of reviewer comments:
Stated simply, the authors argue that compilers should generate deterministic code from parallel programs. The idea is to transform multi-threaded programs with non-deterministic parts into fully deterministic programs by inserting barriers.

Even though this paper is technically thin and may not reflect the level of understanding that the community has about the problem, the presentation may nonetheless stimulate a productive discussion. For example, is it possible to automatically synchronize a program sufficiently to avoid non-determinism while also avoiding deadlock? Might determinizing algorithms fail to be robust against program evolution? The programmer may slightly extend the program and be surprised that the determinizer now picks an execution with a different behavior, one that is still correct but differs from the one selected for the previous version of the program.
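
A minimal sketch of the effect of barrier insertion (our example, not the paper's algorithm), using a C++20 barrier: without the barrier, the final value of x depends on the schedule; with it, every execution produces the same answer.

    #include <barrier>
    #include <thread>

    int x = 0;                        // shared variable
    std::barrier phase(2);            // barrier inserted by the determinizer

    void writer1() {
        x = 1;                        // phase 1: only writer1 touches x
        phase.arrive_and_wait();
    }

    void writer2() {
        phase.arrive_and_wait();      // wait until writer1 is done
        x = 2;                        // phase 2: final value is always 2
    }

    int main() {
        std::thread a(writer1), b(writer2);
        a.join();
        b.join();
        // Without the barrier, x could end up 1 or 2; with it, x == 2
        // in every execution.
    }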

Gradual programming: bridging the semantic gap

by Bor-Yuh Evan Chang, Amer Diwan, and Jeremy Siek (University of Colorado at Boulder, USA)

Summary of reviewer comments:
The paper defines the "semantic gap": a mismatch between what programmers want to express in their code and what the language semantics allows them to describe. The authors provide some examples of semantic gaps, such as the limitations of current static type systems and the inability to take advantage of certain program invariants. The paper outlines a solution that would allow the programmer to gradually add information about the program, enabling better optimization and program comprehension. The proposed solution is based on custom composition of language semantics in the form of "dimensions" and "plugins". The paper indirectly addresses an important issue: how to build extensible software by developing custom languages (here, in the form of customized language semantics).
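
As a loose illustration of gradually added information (our example; the paper's "dimensions" and "plugins" are more general): stating an invariant the language could not otherwise exploit lets the implementation choose a better algorithm.

    #include <algorithm>
    #include <vector>

    // Stage 1: no stated invariant, so lookup must scan linearly.
    bool contains_v1(const std::vector<int>& v, int key) {
        return std::find(v.begin(), v.end(), key) != v.end();
    }

    // Stage 2: the programmer gradually adds the invariant "the vector
    // is sorted" (imagine a checked annotation); the implementation can
    // now use binary search instead.
    bool contains_v2(const std::vector<int>& sorted_v, int key) {
        return std::binary_search(sorted_v.begin(), sorted_v.end(), key);
    }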

One representation to rule them all

by Paul Biggar and David Gregg (Trinity College Dublin, Ireland)

Summary of reviewer comments:
The authors sketch an algorithm for constructing an SSA intermediate representation while performing the analyses that typically precede SSA construction. The goal is to combine the benefits of SSA (a sparse representation) with costly analyses so that the analyses themselves benefit, becoming faster and more precise.

SSA has some nice properties that serve some purposes well, but it is not obvious that it is always practical. Is an SSA-based representation *the* representation to "rule them all"? For example, is register spilling any simpler for code in SSA form than for standard code? Similarly, flow-insensitive pointer analyses are already sparse, and HSSA based on them may be precise enough. How much are we losing by ordering the analyses? Further, it is not clear why the authors want to build on SCCP when incremental SSA construction algorithms exist.
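
A small illustration of the intended benefit (our example, not the paper's algorithm): when constant propagation runs during SSA construction, phi functions whose inputs turn out to be the same constant need never be materialized.

    // Source fragment:
    //     x = 1;
    //     if (p) x = 1;
    //     y = x + 2;
    //
    // SSA built first, analyzed later:
    //     x1 = 1
    //     if (p) x2 = 1
    //     x3 = phi(x1, x2)   // SCCP must run afterwards to learn x3 = 1
    //     y1 = x3 + 2
    //
    // Analysis interleaved with construction: both reaching definitions
    // of x are the constant 1, so no phi is needed and the builder can
    // emit y1 = 3 directly.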

A proposal for targeting streaming languages with the value state dependence graph

by James Stanier and Des Watson (University of Sussex, UK)

Summary of reviewer comments:
The Value State Dependence Graph (VSDG) is a dataflow-like representation of imperative code and as such a natural representation for streaming/vectorizing transformations. There is prior art on transforming sequential imperative code into code with parallel semantics similar to Kahn networks or stream languages; the authors distinguish themselves by proposing to start from the value state dependence graph instead. This approach is similar in spirit to recent work by Thies et al. at MICRO 2007. The paper is a very short summary of technical results and does not yet show any surprising insights. This work may turn into a nice PLDI submission in the future.
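
A rough sketch of why the representation suits streaming (our illustration, not the paper's): in a VSDG, only genuine value and state dependences constrain evaluation order, so independent computations fall out as parallel dataflow.

    // Imperative source, annotated with its VSDG-style dependences:
    //     a = read();     // state edge: the two reads must stay ordered
    //     b = read();     // state edge from the first read
    //     c = f(a);       // value edge from a only
    //     d = g(b);       // value edge from b only: f and g are
    //     write(c + d);   //   independent and can run as a pipeline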

Dependency-driven parallel programming

by Eva Burrows and Magne Haveraaen (University of Bergen, Norway)

Summary of reviewer comments:
The paper presents a method for describing (i) the dependences of a parallel computation with a dependence algebra; (ii) the hardware architecture with a space-time algebra; and (iii) the physical computation as an embedding of (i) into (ii). Even though this seems like the old-style embedding of a program's communication topology into the hardware topology, there may be something new here. It is not clear, however, how this fundamentally differs from the large body of prior work in this area, e.g., data-flow languages or languages based on equations for generating systolic arrays.
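
As a toy instance of the three ingredients (our illustration, not the authors' formalism), consider a one-dimensional stencil iteration mapped onto a linear processor array:

\[
\begin{aligned}
\text{(i) dependences: } & a_i^{k} \text{ depends on } a_{i-1}^{k-1},\ a_i^{k-1},\ a_{i+1}^{k-1},\\
\text{(ii) space-time: } & \text{points } (p,t) \text{ on a linear array with nearest-neighbor links},\\
\text{(iii) embedding: } & \varphi(i,k) = (i,\, k).
\end{aligned}
\]

The embedding is valid because every dependence of $\varphi(i,k)$ lands one time step earlier on processor $i-1$, $i$, or $i+1$, i.e., at most one link away.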

Towards a new definition of object types

by Cong-Cong Xing (Fantasia International Incorporated, USA)

Summary of reviewer comments:
The paper describes a complex type system that can distinguish between object types that provide methods of the same signature but behave differently in ways that may matter to the client. It is unclear why this particular distinction is exceptionally important: the client generally cares about all sorts of semantic properties that are not reflected in the type system. It is also generally very difficult to get complex type systems accepted into widely used programming languages. So while the paper provides a somewhat novel technique of modeling the type structure as a graph and subtyping as a graph algebra, it is not clear that the type system is complete and solvable.
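
To illustrate the kind of distinction at stake (our example, not the paper's): the two classes below have identical method signatures, so conventional object types treat them as interchangeable, yet any client that cares about retrieval order can tell them apart.

    #include <deque>

    // Same signatures, different behavior: LIFO versus FIFO.
    class Stack {
        std::deque<int> d;
    public:
        void put(int x) { d.push_back(x); }
        int  get()      { int x = d.back();  d.pop_back();  return x; }
    };

    class Queue {
        std::deque<int> d;
    public:
        void put(int x) { d.push_back(x); }
        int  get()      { int x = d.front(); d.pop_front(); return x; }
    };

    // Both have the object type { put: int -> void, get: () -> int },
    // yet after put(1); put(2); a Stack's get() returns 2 and a
    // Queue's returns 1.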