Modulo scheduling with rational initiation intervals

It was great to have Patrick Sittel visit our group earlier this year. This blog post is about the work he and I did during his visit, which has been accepted as a paper (co-authored with Martin Kumm and Peter Zipf) at ASP-DAC 2020. Suppose we wish to fit a row of tiles to a… Continue reading Modulo scheduling with rational initiation intervals


How to draw block diagrams

A huge number of academic papers, particularly in the fields of computer systems/architecture, use some sort of block diagram to give readers an overview of the solution being presented. For instance, about two thirds of the papers presented this year at ASPLOS contained at least one of these diagrams, usually towards the start of the paper.… Continue reading How to draw block diagrams

Greatest hits of PLDI 2018

I had a great time at PLDI 2018 last week. Here is my take on a few of the papers that stood out for me. John Vilk presented a tool called BLeak for finding memory leaks in web browsers. One might think that leak detection is not important in a garbage-collected setting, but Vilk explained… Continue reading Greatest hits of PLDI 2018

Concurrency-aware scheduling for high-level synthesis

What follows is a summary of the main contributions of a paper by Nadesh Ramanathan, George Constantinides, and myself that will be presented at the FCCM 2018 conference. If you want to compute something, you have two broad options: do it in software, or do it in hardware. A custom piece of hardware can give you… Continue reading Concurrency-aware scheduling for high-level synthesis

What do you get if you cross Weak Memory with Transactional Memory?

What follows is a summary of the main contributions of a paper I wrote with Nathan Chong and Tyler Sorensen for the PLDI 2018 conference. This project studies two features of a modern computer, one called out-of-order execution and one called transactional memory. Out-of-order execution is where a computer chooses, for performance reasons, to perform its instructions in an order… Continue reading What do you get if you cross Weak Memory with Transactional Memory?

Who has the most POPL and PLDI papers?

DBLP is an online database of academic publications in computer science and related fields. Handily, it provides a Java API for accessing the data programmatically. In this blog post, I share a few fun facts I discovered while using this API to explore the data that DBLP holds about two of the main conferences on… Continue reading Who has the most POPL and PLDI papers?

Ribbon Diagrams for Weak Memory

In their POPL'17 paper, Shaked Flur, Susmit Sarkar, Christopher Pulte, Kyndylan Nienhuis, Luc Maranget, Kathy Gray, Ali Sezgin, Mark Batty, and Peter Sewell describe a semantics of weakly-consistent memory that copes (for the first time) with mixed-size memory accesses. In this blog post, I describe how their semantics can be explained rather nicely with some graphical… Continue reading Ribbon Diagrams for Weak Memory

Translating lock-free, relaxed concurrency from software into hardware

What follows is a summary of the main contributions of a paper I wrote with Nadesh Ramanathan, Shane Fleming, and George Constantinides for the FPGA 2017 conference. Languages like C and C++ allow programmers to write concurrent programs. These are programs whose instructions are partitioned into multiple threads, all of which can be run at the same time.… Continue reading Translating lock-free, relaxed concurrency from software into hardware

Memory Consistency Models, and how to compare them automatically

My POPL 2017 paper with Mark Batty, Tyler Sorensen, and George Constantinides is all about memory consistency models, and how we can use an automatic constraint solver to compare them. In this blog post, I will discuss: What are memory consistency models, and why do we want to compare them? Why is comparing memory consistency… Continue reading Memory Consistency Models, and how to compare them automatically

gets acquire/release instructions wrong

It is quite well-known that the memory operations and fences provided by the C/C++11 languages are pretty complicated and confusing. For instance, my OOPSLA 2015 paper (joint with Batty, Beckmann, and Donaldson) explains that a proposed implementation of these instructions for next-generation AMD graphics cards is faulty because the designers misunderstood how they worked. And… Continue reading gets acquire/release instructions wrong