Seminar | D3S

The D3S Seminar is an irregular meeting of department members and guest speakers. It also runs as the course Advanced Topics in Distributed and Component-Based Systems I, II (NSWI057, NSWI058), which is recommended for Ph.D. and advanced graduate students.

Meetings take place on Tuesdays at 14:00 in S510 (unless noted otherwise in the schedule below). Each seminar is announced in advance on this webpage. We recommend subscribing to the seminar mailing list to receive announcements of special seminars, schedule updates and other important news.

Scheduled Seminars

No seminars are currently scheduled.

Past Seminars

Camille Gobert (Université Paris-Saclay)

June 09, 2026

Towards postmodern protean interaction with computer languages

We use computer languages for a variety of tasks, such as writing programs and authoring documents. To interact with code written in such languages, we typically represent and edit it using textual notation, even in contexts where text editors are inadequate, such as when modifying a table or defining a colour. Alternatives to text have been proposed for more than 50 years, but they were never adopted at scale.

In this talk, I will argue that this is partly because these alternatives were too “uniform” (requiring all code to be represented using a single notation) and partly because they were designed in a “modernist” fashion (in isolation from dominant ecosystems and their constraints). I will then present the work that I conducted to start addressing these limitations by exploring ways to make our interaction with computer languages more “protean” (capable of representing different fragments of code in different ways) using a “postmodern” approach (compatible with established languages and practices, including their flaws and limitations). I will conclude with a few ideas and challenges for future work in that domain.

Jan Tušil (Masarykova univerzita)

June 02, 2026

Formal Semantics for Lazy People: Avoiding implementation work using formal models of programming languages

There exists a large variety of programming languages. Each language requires specialized tool support in the form of compilers, interpreters, verification tools, and program logics, which entails substantial development and maintenance costs. Various mathematical and software tools have been proposed to address this issue. This talk focuses on "programming language semantics frameworks" (PLSFs): meta-tools which use explicit representations of formal programming language semantics to aid and automate the development of concrete tools for concrete languages. We will survey the breadth of the area, then zoom in to see selected techniques in more detail. First, we examine a recent approach to hyperproperty verification: a program logic that can be instantiated for arbitrary deterministic programming languages with minimal additional work. Second, we discuss ongoing work to enable reasoning about pointer-manipulating programs in the spirit of separation logic inside PLSFs, and the associated challenges. Familiarity with operational semantics is helpful but not required.

Andrej Pečimúth

May 19, 2026

Replay Canary: JIT Regression Diagnosis and Optimization Mining

Tracking JIT compiler performance is difficult because ordinary benchmark runs do not compile exactly the same methods across runs and compiler revisions. The replay canary addresses this by recording compiler inputs from benchmark runs and replaying them deterministically against different compiler revisions. This makes compilation-level metrics such as compile time, allocated memory, and generated code size easier to compare and reproduce. In this talk, I will show how the replay canary is integrated into compiler development, how it helped diagnose a surprising regression that initially looked like an improvement, and how the same replay corpus can be used as an evaluation harness for AI-assisted compiler optimization mining. The talk includes a case study from GraalVM and early results from applying an autoresearch loop to conditional elimination optimizations.

Matúš Maďar

May 05, 2026

IronSign: Multi-Chain MPC Framework

IronSign is a (soon to be open) prototype of a multi-chain MPC/TSS signing framework for blockchain transactions. It explores how custodial systems can eliminate the single point of failure of a centrally stored private key by distributing signing across multiple nodes while still producing a standard single-signature output. Its design separates blockchain-specific transaction handling from the cryptographic signing core, enabling reuse across chains that share ECDSA over secp256k1, especially Bitcoin and Ethereum. A central architectural feature is the use of presignatures, which shift the expensive interactive part of threshold signing into an initialization phase and make runtime signing lightweight. The prototype also incorporates an air-gapped offline node as an additional security layer and demonstrates the full workflow on Bitcoin regtest, from transaction construction to final blockchain acceptance.

Sidney Congard (Nantes Université)

March 17, 2026

Resource-oriented programming in presence of errors

Programming with resources (e.g. allocations, files, locks, ...) poses many challenges. Despite claiming to address them, linear types have not been adopted in mainstream programming languages. In this talk, I will first discuss the notion of resource, formalize the practical issue of combining linearity and errors, then skim through ways of dealing with errors in the presence of resources. Then, I will fix a linear calculus with resource-safety properties that models memory allocations. I will then implement destructors and move semantics with a translation into this calculus. This translation is explained by call-by-push-value and resource modalities, hence providing a Curry-Howard correspondence for destructors that clarifies the affine, linear and ordered aspects of such a language. Finally, I will introduce mutable borrowing and sketch a linear functional translation to capture it.

Bio: Sidney Congard is a doctoral student under Guillaume Munch-Maccagnoni and Rémi Douence in the Inria Gallinette team, at Nantes University, France. He graduated from a computer engineering school, worked as a C++ programmer for 3 years, then completed a master's degree in mathematical logic. He is studying programming languages with resources from the point of view of the Curry-Howard correspondence, in particular destructors and borrowing. He also dabbles in type theory, and is more generally interested in various aspects of deductive sciences and their context.

David Corfield (University of Kent)

February 03, 2026

Safeguarding via category-theoretic systems theory

The field of 'applied category theory' has flourished since around 2010. Here, beyond earlier applications of the mathematical language of category theory to mathematics itself, and to logic, computer science and physics, practitioners have been using it to study topics such as dynamical systems, database theory, natural language processing, cognition, systems biology, epidemiology, chemical reaction networks, game theory, and robotics. In 2024, the UK research agency ARIA launched a program whose aim is to devise software capable of representing the organisation of composite cyber-physical systems with a view to providing guarantees for system behaviour. In this talk I will explain some of the ideas behind the program.

Bio: David Corfield worked in academia for many years as a philosopher, most recently at the University of Kent. His research interests concern practice- and history-oriented approaches to the philosophy of mathematics, science, and medicine. He has authored numerous articles and three books ('Towards a philosophy of real mathematics', 2003; 'Why do people get ill?', 2007, with D. Leader; 'Modal homotopy type theory', 2020). He has been interested in category theory for over 30 years and is currently funded by ARIA's Safeguarded AI program.

Andrej Pečimúth

January 20, 2026

Replay Canary: Continuous Regression Detection for JIT Compilation Metrics

Compiler engineers continuously monitor the performance of the compilation process, specifically compilation time, memory footprint, and code size. For JIT compilers, minimizing these metrics is critical as they compete for hardware resources with the host application. However, obtaining accurate measurements is computationally expensive due to the high variance caused by non-deterministic compiler inputs across virtual machine (VM) runs. We propose a systematic approach that records all compiler inputs during a benchmark run and replays these compilations across different compiler revisions. By fixing the inputs, we eliminate metric variability and remove the need to execute benchmark applications for every revision, significantly reducing computing costs. The technique can be deployed as a canary test to quickly classify whether a compiler patch changes the tracked compilation metrics, helping developers ensure that accidental regressions are not integrated. We tested the technique in the production GraalVM JIT compiler and found that it can effectively detect small shifts in compilation performance with minimal false positives.

Christoph Kirsch (University of Salzburg)

December 02, 2025

Incremental Bounded Model Checking of RISC-V Machine Code with Binary Decision Diagrams

Symbolic execution is a powerful technique for analyzing the behavior of software, yet scalability remains a challenge due to state explosion in control and data flow. Existing tools typically aim at managing control flow internally, often at the expense of completeness, while offloading reasoning over data flow to SMT solvers. Moreover, reasoning usually happens at the source code or intermediate representation level to leverage structural information, making machine code generation part of the trust base. We are interested in changing the equation in two non-trivial ways: pushing reasoning down to the machine code level, and then offloading reasoning entirely into SMT solvers and other, possibly more efficient solver technology. In more abstract terms, we are asking if bit-precise reasoning technology can be made scalable on software, and not just hardware. For this purpose, we developed two tools called rotor and bitme for model generation and bounded model checking, respectively. We chose RISC-V restricted to integer arithmetic as the modeling target for rotor since RISC-V integer semantics is essentially equivalent to established SMT semantics over bitvectors and arrays of bitvectors. While state-of-the-art SMT solvers struggle in our experiments, we have evidence that there is potential for improvement. To show the potential, we have slightly generalized and then implemented in bitme two types of binary decision diagrams (BDDs): algebraic decision diagrams (ADDs) and context-free-language ordered binary decision diagrams (CFLOBDDs). Bitme uses BDDs to propagate program input through models, essentially generalizing constant propagation to domain propagation. SMT solvers only get involved when model input cannot be propagated, significantly speeding up SMT solving. In other words, BDDs enable bitme to apply integer arithmetic natively rather than relying on SMT solving.
We then study the impact of incrementally increasing the size of model input in bitme, which delays exponential explosion in BDDs to find satisfying model input earlier while maintaining completeness in the limit. This is joint work with Anna Bolotina, Stefanie Muroya Lei, Matthias Pleschinger, and Alireza Sohrabi.
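
The step from constant propagation to domain propagation mentioned in the abstract can be illustrated with a small sketch. This is not bitme's implementation: it uses explicit Python sets instead of BDDs, assumes 8-bit words, and the function `step` is a hypothetical stand-in for a fragment of machine code.

```python
MASK = 0xFF  # assumption: 8-bit words, to keep the example small


def propagate(domain, op):
    """Apply a unary bitvector operation to every value in a domain."""
    return frozenset(op(value) & MASK for value in domain)


def step(x):
    """Hypothetical program fragment: x * 2 + 1 with wrap-around."""
    return x * 2 + 1


# Constant propagation is the special case of a one-element domain.
constant = propagate(frozenset({3}), step)

# Domain propagation pushes a whole set of possible inputs through at once.
domain = propagate(frozenset(range(4)), step)

# A check such as "can the result be zero?" is answered directly from the
# propagated domain, without consulting a solver.
can_be_zero = 0 in domain
```

Explicit sets grow with the number of inputs, which is where a compact representation such as a decision diagram becomes relevant.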

Bio: Christoph Kirsch is Professor at the Department of Computer Science of the University of Salzburg, Austria. He received his Dr.Ing. degree from Saarland University in 1999 while at the Max Planck Institute for Computer Science in Saarbrücken, Germany. From 1999 to 2004 he worked as Postdoctoral Researcher at the Department of Electrical Engineering and Computer Sciences of the University of California, Berkeley. He later returned to Berkeley as Visiting Scholar (2008-2013) and Visiting Professor (2014) at the Department of Civil and Environmental Engineering. Since 2022 he has chaired the Programming Research Laboratory at the Faculty of Information Technology of the Czech Technical University in Prague. His research interests are in concurrent programming, memory management, virtualization, and formal verification. Dr. Kirsch co-invented embedded programming languages and systems such as Giotto, HTL, and the Embedded Machine, and more recently co-designed high-performance, multicore-scalable concurrent data structures and memory management systems. He co-founded the International Conference on Embedded Software (EMSOFT) in 2001 and served as ACM SIGBED chair from 2011 until 2013. He has been an IEEE TCAD and ACM TODAES associate editor, and has been an ACM Distinguished Speaker since 2017.

Milad Ashqi Abdullah

October 07, 2025

Prioritizing Performance Bug Resolution Using Multi-Factor Indicators

Efficiently addressing performance bugs is critical to maintaining software quality and user satisfaction. However, developers often face uncertainty when deciding which bugs to resolve first. In this work, we propose an approach to optimize bug prioritization by leveraging three key indicators: the estimated time required to fix the bug, the severity of its impact, and the number of tests that flagged it. Using the Mozilla performance bug dataset, we analyze how these factors can be combined to guide decision-making and improve the efficiency of the debugging process. Our results highlight that a multi-indicator framework enables more systematic prioritization, ensuring that developers focus on bugs with the highest overall cost–benefit trade-off.
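
As a rough illustration of how the three indicators might be combined, the sketch below ranks bugs by a benefit-to-cost ratio. The scoring formula, the weights, the severity scale and the sample bugs are all assumptions made for this example; the abstract does not specify how the indicators are combined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerfBug:
    bug_id: str
    fix_hours: float     # estimated time required to fix the bug
    severity: int        # hypothetical scale: 1 (minor) to 5 (critical)
    flagging_tests: int  # number of tests that flagged the bug


def priority(bug, w_severity=1.0, w_tests=0.5):
    """Benefit per hour of fixing effort; higher means fix sooner.

    The weights are illustrative assumptions, not values from the talk.
    """
    benefit = w_severity * bug.severity + w_tests * bug.flagging_tests
    return benefit / max(bug.fix_hours, 0.5)  # guard against tiny estimates


def prioritize(bugs):
    """Return the bugs ordered from highest to lowest priority."""
    return sorted(bugs, key=priority, reverse=True)


bugs = [
    PerfBug("A", fix_hours=8.0, severity=5, flagging_tests=2),
    PerfBug("B", fix_hours=1.0, severity=2, flagging_tests=4),
    PerfBug("C", fix_hours=4.0, severity=3, flagging_tests=1),
]
ranked = [bug.bug_id for bug in prioritize(bugs)]
```

With these made-up numbers the cheap, widely flagged bug B comes first, and the severe but expensive bug A comes last.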

Jiří Klepl

September 23, 2025

Noarr for MPI

Message Passing Interface (MPI) has been a well-established technology in the domain of distributed high-performance computing for several decades. However, one of its greatest drawbacks is a rather ancient pure-C interface. It lacks many useful features of modern languages (namely C++), like basic type-checking or support for generic code design. In this paper, we propose a novel abstraction for MPI, which we implemented as an extension of the C++ Noarr library. It follows Noarr paradigms (first-class layout and traversal abstraction) and offers layout-agnostic design of MPI applications. We also implemented a layout-agnostic distributed GEMM kernel as a case study to demonstrate the usability and syntax of the proposed abstraction. We show that the abstraction achieves performance comparable to the state-of-the-art MPI C++ bindings while allowing for a more flexible design of distributed applications.

Tomáš Petříček

September 16, 2025

Denicek: Computational Substrate for Document-Oriented End-User Programming

User-centric programming research gave rise to a variety of compelling programming experiences, including collaborative source code editing, programming by demonstration, incremental recomputation, schema change control, end-user debugging and concrete programming. Those experiences advance the state of the art of end-user programming, but they are hard to implement on the basis of established programming languages and systems.

We present Denicek, a computational substrate that simplifies the implementation of the above programming experiences. Denicek represents a program as a series of edits that construct and transform a document consisting of data and formulas. Denicek provides three operations on edit histories: edit application, merging of histories and conflict resolution. Many programming experiences can be easily implemented by composing these three operations.

We discuss the architecture of Denicek, discuss key design considerations and elaborate the implementation of a variety of programming experiences. To evaluate the proposed substrate, we use Denicek to develop an innovative interactive data science notebook system. The case study shows that the Denicek computational substrate provides a suitable basis for the design of rich, interactive end-user programming systems.

Joel Jakubovic

September 16, 2025

The Unix Executable as a Smalltalk Method

Unix and Smalltalk are very different in the details, but bear curious similarities in their broad outlines. Prior work has made these comparisons at a high level and sketched a path for retrofitting Smalltalk’s advantages onto Unix (without compromising the advantages of the latter). Everybody seems to agree on identifying the Unix file with the Smalltalk object, but this still leaves much unspecified. I argue that we should identify the Unix executable with the Smalltalk method. A Smalltalk VM implementation via the filesystem falls out quite easily from this premise; however, the severe overhead associated with Unix processes casts doubt on its practical realisation. Nevertheless, we can see several ways around this problem. The connection shows promise for realising the benefits of Smalltalk within Unix without sequestering the former in a hermetically sealed image and VM.

Pablo Donato (Grothendieck Institute)

June 10, 2025

Deep Inference for Graphical Theorem Proving

Contemporary proof assistants such as Rocq, Lean and Agda provide robust frameworks for constructing and verifying formal proofs, and can also be used as very expressive programming languages thanks to their powerful type systems. However, their interfaces remain largely textual, and require users to master mathematical logic and functional programming. This poses significant usability barriers, limiting both broader adoption and more exploratory, human-centered forms of reasoning. This talk introduces Proof-by-Action (PbA), a novel paradigm developed in my PhD thesis for interacting with proof assistants through direct manipulation in a graphical user interface. Grounded in the proof-theoretic framework of deep inference, PbA allows users to construct proofs via intuitive gestures, such as clicking and dragging, performed directly on logical statements. A live demonstration in Rocq will showcase the integration of this paradigm into a state-of-the-art proof assistant, enabling gestural proof construction within a trusted backend and a rich library ecosystem. I will then present a more ambitious research direction: replacing the traditional symbolic notations of logic with iconic representations that leverage our spatial intuition. I focus on a long-neglected diagrammatic formalism introduced by C. S. Peirce at the end of the 19th century, existential graphs, and on my intuitionistic extension of them, the Flower Calculus. I will conclude with my latest work toward a Curry-Howard interpretation of existential graphs, which I hope will pave the way for a new kind of interactive programming system that is both strongly typed and genuinely user-friendly.

Bio: Pablo Donato is a postdoctoral researcher in computer science currently based in Paris. His research focuses on the design of graphical proof languages and interactive tools for formal reasoning, with a particular interest in rethinking the user experience of proof assistants and programming systems. During his Ph.D., he developed the Proof-by-Action paradigm, a novel approach to proof construction based on direct manipulation and the proof-theoretic framework of deep inference. He also introduced the Flower Calculus, a diagrammatic system inspired by C. S. Peirce's existential graphs. His goal is to make correct-by-construction programming more feasible by redesigning our notations for the dynamic medium. More information can be found on his website: https://pablogician.refl.fr/.

Andrej Pečimúth

May 20, 2025

A Pragmatic Approach to Replay Compilation (Rehearsal Talk)

Dynamic compilers generate code based on the information provided by the virtual machine (VM) running the corresponding application. Due to the environment's non-deterministic nature, every compilation result is typically unique. This is a problem when reproducibility is desired, such as when debugging a crash of the JIT compiler or diagnosing performance problems. As a solution, we present a pragmatic approach to replay compilation that is suitable for integration in a production-grade VM. Our approach is based on instrumenting the VM's compiler interface, allowing us to record the compiler's queries to the VM and their results. We serialize them and use them to replicate the compiler's query results in a replayed compilation. Assuming the compiler is deterministic, this approach systematically ensures that the replayed compilation result is equivalent to the recorded one. The dynamic compiler is invoked directly without the need to execute the original application. A compiler developer can replay a compilation with additional diagnostic options or evaluate metrics such as compilation speed. We developed a working prototype for GraalVM, showing that replay compilation can be implemented without requiring extensive compiler or VM changes. We are working with the GraalVM developers to integrate it into the open-source compiler to unlock these benefits and new use cases for the community.
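
The record-and-replay idea described in the abstract can be sketched in a few lines. This is a toy model, not the GraalVM prototype: `ToyVM`, `RecordingVM`, `ReplayVM`, `compile_method` and the `call_count` query are hypothetical names invented for the illustration, and JSON stands in for whatever serialization format the real system uses.

```python
import json


class ToyVM:
    """Hypothetical VM that answers compiler queries from live profile data."""

    def __init__(self, profile):
        self._profile = profile

    def query(self, name, *args):
        if name == "call_count":
            return self._profile[args[0]]
        raise KeyError(name)


class RecordingVM:
    """Instrumented interface: forwards each query and logs its result."""

    def __init__(self, vm):
        self._vm = vm
        self.log = []

    def query(self, name, *args):
        result = self._vm.query(name, *args)
        self.log.append({"name": name, "args": list(args), "result": result})
        return result


class ReplayVM:
    """Answers queries from a serialized log; no application is running."""

    def __init__(self, serialized):
        self._answers = {
            (entry["name"], tuple(entry["args"])): entry["result"]
            for entry in json.loads(serialized)
        }

    def query(self, name, *args):
        return self._answers[(name, tuple(args))]


def compile_method(vm, method):
    """Toy deterministic compiler whose output depends on a VM query."""
    hot = vm.query("call_count", method) > 1000
    return f"{method}:{'inlined' if hot else 'plain'}"


live = RecordingVM(ToyVM({"foo": 5000}))
recorded = compile_method(live, "foo")      # compile inside the "running" VM
serialized = json.dumps(live.log)           # persist the recorded queries
replayed = compile_method(ReplayVM(serialized), "foo")  # replay offline
```

Because `compile_method` is deterministic and sees the same query results, the replayed output matches the recorded one, which is the property the abstract relies on.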

Yannic Noller (Ruhr University Bochum)

May 15, 2025

Automated Program Repair for Security

Security vulnerabilities detected via techniques like greybox fuzzing are often fixed with a significant time lag. This increases the exposure of the software to vulnerabilities. Therefore, we need technology that can automatically propose fixes for software vulnerabilities. In this talk, I will introduce automated program repair for security and highlight our recent achievements in this area. In particular, I will present our work on fixing security vulnerabilities related to division-by-zero, overflows, null pointer dereferences, etc., using concolic execution, specification inference, and search techniques. This approach avoids generating fix suggestions merely at the crash location because such fixes often disable the manifestation of the error instead of fixing the error. Instead, based on sanitizer-guided concolic execution, we infer desired constraints at specific program locations and then opportunistically search for code mutations that help respect those constraints. Further, I will dive into the detection, quantification, and repair of errors that cannot be easily detected with sanitizers, like side-channel vulnerabilities in the source code. For fixing timing side-channel vulnerabilities, our approach uses a quantitative estimation of found vulnerabilities to guide the fix localization, which goes hand-in-hand with a pattern-guided repair. Overall, this approach integrates vulnerability detection, quantification, localization, and repair into one unified process.

Bio: Yannic Noller is a professor at the Faculty of Computer Science (https://informatik.rub.de/en/) at the Ruhr University Bochum (https://www.ruhr-uni-bochum.de/en) (RUB) and leads the Software Quality (https://informatik.rub.de/en/sq/) group. His research focuses on how software quality can be maintained and improved with automated testing and repair technologies. His general research goal is to shape the future of software development by contributing to the domain of automated software engineering and providing the means to develop reliable, trustworthy, and secure software systems. Before joining RUB in July 2024, Yannic was an Assistant Professor at Singapore University of Technology and Design (https://www.sutd.edu.sg/) (SUTD) and a Research Assistant Professor in the Department of Computer Science at the National University of Singapore (https://www.nus.edu.sg/) (NUS). He pursued his Ph.D. in Computer Science in the Software Engineering group (advised by Prof. Lars Grunske (https://www.informatik.hu-berlin.de/de/Members/lars-grunske)) at the Humboldt-Universität zu Berlin (https://www.hu-berlin.de/en), Germany. His Ph.D. research focused on differential software testing, in particular, by combining fuzzing and symbolic execution in the context of regression analysis, algorithmic complexity analysis, side-channel analysis, and robustness analysis of neural networks. More information can be found on his website: https://yannicnoller.github.io/

Matyáš Brabec

May 13, 2025

Slaying a Life: Optimizing GPU-accelerated Game of Life Stencil

David Kozák (Oracle Labs)

April 29, 2025

Deep Dive into Static Analysis in GraalVM Native Image

GraalVM Native Image is an ahead-of-time compilation technology that transforms Java applications into standalone native executables. At its core, Native Image performs points-to analysis to determine the application's reachable code and data. This process is tightly integrated with build-time class initialization and heap snapshotting to optimize startup time and resource usage. In this lecture, we will explore the inner workings of the static analysis in Native Image, examining how it drives the compilation process and interacts with other subsystems. We will share insights gained from applying the analysis to complex, large-scale applications and highlight key challenges and trade-offs. Finally, we will discuss ongoing research directions and future developments aimed at improving the precision, scalability, and usability of static analysis in GraalVM Native Image.

Adriana Jubera

April 22, 2025

Enhancing ECG Signal Classification with Recurrent Neural Network

This work explores the application of recurrent neural networks (RNNs) in the analysis of electrocardiography (ECG) data, with a focus on detecting abnormal cardiac patterns. Using the MIT-BIH Arrhythmia Database, two Long Short-Term Memory (LSTM) models are developed to classify ECG signals and identify anomalies. The first model is a basic LSTM, while the second incorporates an advanced architecture combining convolutional layers and an attention mechanism, enhancing the model’s ability to capture both spatial and temporal patterns. We compare the performance of these models in terms of accuracy, loss, and training time, and demonstrate that the enhanced model with attention and convolution outperforms the basic LSTM, achieving 96.82% accuracy compared to 95.42% for the basic model. Despite the additional computational cost, the improved model provides better generalization, making it a suitable choice for real-world applications in cardiac anomaly detection. This study highlights the potential of LSTMs in healthcare, particularly in automated ECG analysis for disease prediction.

Stephen Kell (King's College London)

March 27, 2025

How debuggable is your (compiler-optimised) program?

Source-level debugging of compiled code only works when compilers generate the necessary metadata. Currently, that means it rarely works well, at least in optimising ahead-of-time compilers like LLVM and GCC. I'll give an overview of how compiler-generated metadata enables source-level debugging, the challenges of making it work for optimised code, and our recent work on doing better. Whereas compilers have so far taken a "best-effort" approach with no particular correctness criterion, I'll outline a correctness condition for local variable information that seems to balance the relevant trade-offs. I'll then describe a tool we've built that can use this to mechanically find valid LLVM bugs capturing avoidable losses or corruptions of debug info. A theme will be how the textbook framing of compiler optimisations as "eliminating" code or variables could be more constructively thought of as "residualising" them into debug info; I'll finish with some thoughts on what that could mean for how compilers are built. All this is joint work with J. Ryan Stinnett.

Alan Mycroft (Cambridge University)

March 26, 2025

Points for Free: Embedding Pointful Array Programming in Python

Alan will talk about joint work with Jakub Bachurski. Multidimensional array operations are ubiquitous in machine learning. The dominant ecosystem in this field is centred around Python and NumPy, where programs are expressed with elaborate and error-prone calls in the point-free array programming model. Such code is difficult to statically analyse and maintain. Various other array programming paradigms offer to solve these problems, in particular the pointful style of Dex. However, only limited approaches, based on Einstein summation, have been embedded in Python. We introduce Ein, a pointful array DSL embedded in Python. We also describe a novel connection between pointful and point-free array programming. Thanks to this connection, Ein generates performant and type-safe calls to NumPy with potential for further optimisations. Ein reconciles the readability of comprehension-style definitions with the capabilities of existing array frameworks.

Grigory Fedyukovich (Florida State University)

December 02, 2024

Maximizing Branch Coverage with Constrained Horn Clauses

State-of-the-art solvers for constrained Horn clauses (CHC) are successfully used to generate reachability facts for software using its symbolic encoding. In this talk, I will present a new application of CHCs to test-case generation, the problem of finding a set of tuples of input values to a program under which the program visits as many branches as possible. The key insight to achieve maximality is to identify and skip blocks of code that are provably unreachable. The new approach to test-case generation, called HORNTINUUM, uses CHCs to construct different program unrollings incrementally and extract test cases from models of satisfiable formulas. At the same time, a CHC solver keeps track of CHCs that represent unreachable blocks of code, making the unrolling process more efficient. In practice, this lets HORNTINUUM terminate early while guaranteeing maximal coverage. HORNTINUUM exhibits promising performance: it generates high coverage in most cases and takes less time on average than state-of-the-art tools based on bounded model checking, concolic execution, and/or fuzzing.

Bio: Grigory Fedyukovich is an Assistant Professor at Florida State University. He completed his Ph.D. at the University of Lugano under the supervision of Prof Natasha Sharygina, a postdoc at the University of Washington with Prof Rastislav Bodik, and a postdoc at Princeton University with Prof Aarti Gupta. His main research interests are in the fields of automated reasoning, software verification, and synthesis.

Aleksander Boruch-Gruszecki

October 15, 2024

Gradient: Gradual Compartmentalization via Object Capabilities Tracked in Types

Modern software needs fine-grained compartmentalization, i.e., intra-process isolation. A particularly important motivation is supply-chain attacks, the risk of which is aggravated by modern applications depending on hundreds or even thousands of libraries. Object capabilities (ocaps) are a particularly salient approach to compartmentalization, but they require the entire program to assume a lack of ambient authority. Most existing code was written under no such assumption; effectively, existing applications need to undergo a rewrite-the-world migration to reap the advantages of ocaps. We propose gradual compartmentalization, an approach which allows gradually migrating an application to object capabilities, component by component in arbitrary order, all the while continuously enjoying security guarantees. The approach relies on runtime authority enforcement and on tracking the authority of objects in the type system. We present Gradient, a proof-of-concept gradual compartmentalization extension to Scala which uses Enclosures and Capture Tracking as its key components. We evaluate our proposal by migrating the standard XML library of Scala to Gradient.

Andrej Pečimúth

October 15, 2024

An Analysis of Compiled Code Reusability in Dynamic Compilation (Rehearsal Talk)

Large applications reliant on dynamic compilation for performance often run in horizontally scaled architectures. When this is combined with frequent deployment or demand-based scaling, hardware capacity is lost to frequent warmup phases due to the need to recompile the code after each start of the virtual machine (VM). Moreover, the individual VMs waste hardware resources by repeating the same compilations. Offloading compilation jobs to a dedicated compilation server can mitigate these problems. Such a server can compile the code in a mode where the compilation result is reusable for multiple VMs. The goal is to save compilation resources, such as CPU and memory, and potentially improve the warmup time of individual VMs. This paper investigates the options to reuse previous compilation results within a high-performance VM. We present an empirical study using the GraalVM compiler and the HotSpot Java VM. To facilitate code reuse, we introduce an approach that compiles code into a reusable high-level intermediate representation (IR). This approach defers VM-specific optimizations until the time of reuse. The incurred slowdown of such code varies by workload, ranging between a negligible impact and a 6x slowdown. Although deferred optimization impacts the efficiency of particular code patterns, such reused code still performs significantly better than that compiled by a lower-tier compiler. Therefore, the presented approach can form the foundation for improving warmup times in certain workloads.

Jan Liam Verter

October 08, 2024

Don’t Call Us, We’ll Call You (Towards Mixed-Initiative Interactive Proof Assistants for Programming Language Theory)

There are two kinds of systems that programming language researchers use for their work. Semantics engineering tools let them interactively explore their definitions, while proof assistants can be used to check the proofs of their properties. The disconnect between the two kinds of systems leads to errors in accepted publications and also limits the modes of interaction available when writing proofs. When constructing a proof, one typically states the property and then develops the proof manually until an automatic strategy can fill the remaining gaps. We believe that an integrated and more interactive tool that leverages the typical structure of programming language proofs could do better. In the early work presented in this paper, we focus on the problem of interacting with a proof assistant. Rather than starting with manual proof construction and then completing the last steps automatically, we propose a way of working where the tool starts with an automatic proof search and then breaks when it requires feedback from the user. Our early experience suggests that this way of working can make proof construction easier.

Maya Mückenschnabel

October 08, 2024

Algebraic Effect Handlers with Bidirectional Type-Checking

In the development of type systems there are multiple paths for achieving more expressiveness. On the one hand, algebraic effect handlers allow us to reason about a program's side effects. On the other hand, dependent types make it possible to reason more precisely about program states and data and to prove general mathematical statements. Our work is concerned with the development of a practical Lisp-based programming language that combines dependent types with effect handlers. More specifically, we focus on describing a type system that has the following two features. First, it combines the algebraic effect system with the bidirectional type-checking algorithm in order to support effects in dependent types. Second, it introduces further type inference rules for greater flexibility.

Adam Šmelko

September 24, 2024

Employing Parallel Computing in Data-Intensive Tasks (PhD defence rehearsal talk)

Michal Töpfer

May 28, 2024

Can LLMs Understand Software Architectures?

This talk will focus on two examples of software architectures and whether large language models (LLMs) can understand them. First, we will look at workflow architectures represented by experiment workflows from the ExtremeXP project. The second part of the talk will be about the DEECo ensemble-based component model and what to expect from LLMs in this context.

Andrej Pečimúth

May 21, 2024

Remote JIT compilation with code caching

Large, complex applications based on dynamic runtimes can be inefficient in deployments with frequent virtual machine (VM) restarts, such as serverless computing or deployments with demand-driven scaling. The problem is the need to recompile code on each VM startup so that the application can reach peak performance. We propose to offload the compilations to a separate server and reuse the information from previous compilations. The goal of this approach is to save resources and reduce the time each VM takes to start.

Milad Ashqi Abdullah

May 07, 2024

Robin

Systematic literature mapping is an essential part of research methodology, but conducting it is challenging. Researchers query publications from various sources, and the results need to be filtered, categorized, and cleared of duplicates. The number of publications usually ranges from hundreds to thousands. The whole process is often performed iteratively and repeatedly, especially at the start of the mapping study, which only further increases the effort. When a team of researchers conducts a mapping study, the members may have different opinions on filtering and categorizing papers, which must be resolved. To our knowledge, this problem is very poorly supported by open-source tools. To address these issues, we present a tool called Robin, which facilitates managing the steps of conducting a mapping study within a team. It provides search tools, categorization, and a platform for team members to define their criteria for including and excluding papers. In addition, Robin is connected to publicly available publication search platforms such as the IEEE and Scopus APIs. Robin is written in Python using Django and can be installed as a web service.

Jiří Klepl

April 23, 2024

Pure C++ Approach to Optimized Parallel Traversal of Regular Data Structures

Cristina Abad Robalino (Escuela Superior Politécnica del Litoral, Guayaquil, Ecuador)

March 26, 2024

Designing Serverless Platforms to Support Emerging Applications

Serverless computing offerings from cloud providers have gained significant traction in recent years due to the advantages that these platforms bring with their flexible pricing models, built-in scalability, and minimal operational requirements. In a recent survey of serverless use cases, we found a wide variety of applications that depend on these services, including implementing the core functionality at the backend of mobile applications, automating the DevOps tasks of complex distributed applications, real-time processing of IoT streaming data, and scientific applications. To properly support these applications, the platforms should be fast, self-managing, and provide support for diverse QoS requirements. As a result, novel improvements to serverless platforms are rapidly being proposed and adopted. Evaluating these solutions necessitates application-based, workload-aware benchmarking tools that the community can rely on. This talk addresses these challenges and our research efforts on tackling them, presenting a performance engineering perspective on the current state and future challenges of serverless computing research. I will describe our solutions in resource management for serverless platforms, focusing on solutions that improve performance or reduce costs via scheduling, caching, and right-sizing of resources, along with our ongoing efforts in developing an application-driven serverless benchmark.

Jan Vitek

October 17, 2023

Two talks for the price of one!

Two rehearsal talks for the OOPSLA conference. The first, “Reusing JIT compiled code”, is an OOPSLA talk where we show how to soundly reuse previously compiled code across programs. The second is an invited talk for the Dynamic Languages Symposium titled “Prof. Strangelove Or: How I stopped worrying and started to love dynamic languages”, where I will tell you about my experience teaching, researching and funding work on strange languages.

Andrej Pečimúth

October 10, 2023

Remote JIT Compilation / Diagnosing Compiler Performance

This talk is a conference rehearsal consisting of two parts. In the first part, we introduce remote JIT compilation, which addresses the challenges of compiler resource usage and warmup in dynamic languages. In the second part, we describe an approach to diagnosing performance issues in dynamic compilers by capturing and comparing the compiler's optimization decisions.

Darius Blasband (CEO of Raincode)

September 21, 2023

Compiler-Integrated Meta-Programming For Legacy Languages

Most meta-programming systems operate on small volumes of code, written in idealized languages, processed in controlled and friendly environments. At the opposite end of the spectrum, this presentation describes an industrial-grade mechanism integrated in a commercial product to be deployed on tens or hundreds of unsupervised machines, and how it addresses the challenges that come with huge code bases written in antiquated and poorly defined languages. A number of real-world use cases are described to validate the approach.

Philipp Rümmer (University of Regensburg)

May 23, 2023

Black Ostrich: Web Application Scanning with String Solvers

Securing web applications remains a pressing challenge. Unfortunately, the state of the art in web crawling and security scanning still falls short of deep crawling. A major roadblock is the crawlers’ limited ability to pass input validation checks when web applications require data of a certain format, such as email, phone number, or zip code. This talk presents Black Ostrich, a principled approach to deep web crawling and scanning. The key idea is to equip web crawling with string constraint-solving capabilities to dynamically infer suitable inputs from regular expression patterns in web applications and thereby pass input validation checks. To enable this use of constraint solvers, we develop new automata-based techniques to handle complex real-world regular expressions, including support for the relevant features of ECMA JavaScript regular expressions. We implement our approach by extending and combining the Ostrich constraint solver with the Black Widow web crawler. Joint work with Benjamin Eriksson, Amanda Stjerna, Riccardo De Masellis, and Andrej Sabelfeld, to appear in the ACM Conference on Computer and Communications Security (CCS) 2023.