Programming has a rich history and knows no limitations! Join us not only to make existing programming tools better, but also to envision new ways of constructing programs and explore the rich history of the discipline!
Our project ideas cover topics such as type systems, but also interactive programming environments or programming in the context of data science.
Do you want to learn more about who we are and what we do first? Check out the Programming Languages and Systems page and our list of recently defended student theses from the area of programming languages and systems!
Our Research Areas
Better Tools for Data Analysis and Visualization
Data analysis and data visualization are also programming, but the kind of code that data analysts write often differs from ordinary code written by programmers. Analysts also have a different way of working: they often start from concrete data and write code interactively. There is a lot of work to be done on understanding how data analysts write code, analysing the code they write and building better tools for this kind of programming.
- Analysing Real-World Data Science Code. What exactly does data analysis code look like? We can find out by analysing data scraped from GitHub. A concrete project can then tackle different questions such as (1) how does data science code (e.g., Python in Jupyter notebooks) differ from ordinary code (e.g., Python libraries and applications), (2) what language features are used in data analysis code, or even (3) can we automatically detect certain kinds of bugs?
- Tracking Provenance in Data Analysis and Visualizations. Data analyses and visualizations should make it possible to see how the visual representation links to the original data source (e.g., what contributed to the height of a bar in a bar chart). The aim of the project is to build a small data visualization language or tool, inspired by Fluid [2], that can track provenance (the source of data) through simple data transformations.
- Data Visualizations to Encourage Critical Thinking. How can we visualize data so that the result makes viewers think more critically about what they see? A nice example of this is the You Draw It visualization by The New York Times [1]. How can we build other visualizations like this? And could we also encourage readers to think critically about the model behind the data (e.g., for agent-based economic models)?
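As a rough illustration of the provenance-tracking idea above, the following Python sketch pairs every number with the set of input rows it was computed from, so that each bar in a bar chart can report which rows contributed to its height. The names `Tracked`, `load` and `bar_heights` and the sample data are hypothetical and invented for this sketch; this is not how Fluid works, only one simple way to propagate provenance through a sum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tracked:
    """A number paired with the identifiers of the source rows it depends on."""
    value: float
    sources: frozenset = frozenset()

    def __add__(self, other: "Tracked") -> "Tracked":
        # A sum depends on everything that either operand depended on.
        return Tracked(self.value + other.value, self.sources | other.sources)


def load(rows):
    """Wrap each raw (category, amount) row, tagging it with its row index."""
    return [(cat, Tracked(amount, frozenset({f"row {i}"})))
            for i, (cat, amount) in enumerate(rows)]


def bar_heights(tracked_rows):
    """Group by category and sum, keeping provenance for every bar."""
    bars = {}
    for cat, cell in tracked_rows:
        bars[cat] = bars[cat] + cell if cat in bars else cell
    return bars


# Hypothetical input data, invented for illustration.
data = [("coal", 10.0), ("wind", 4.0), ("coal", 5.0), ("wind", 1.5)]
bars = bar_heights(load(data))

for cat, bar in bars.items():
    print(cat, bar.value, sorted(bar.sources))
```

A real tool would need to handle richer transformations (filters, joins, scaling) and link provenance to the rendered marks, which is where the project's design questions begin.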
References
- [1] You Draw It: What Got Better or Worse During Obama’s Presidency - nice data vis!
- [2] Fluid: data-linked visualisations - explorable data visualizations
Novel Interactive Programming Systems
Most programming is done in conventional imperative, object-oriented or functional languages, but there are many alternative and less well-explored ways of programming! Programs can be created using visual languages, by interacting with values or by specifying constraints. There are many possible projects and theses exploring those alternative ways of programming. The following are a few concrete examples.
- Constraint-Based Graphical User Interfaces. The idea of constructing user interfaces by specifying constraints between visual elements has been around for ages [1], but for some reason, this is not how most GUIs are constructed! The idea of the project is to implement and evaluate this approach using a significant case study to better understand its capabilities and limitations.
- Document-Based Programming System. Can we make programming for non-experts easier by embedding it into a document format? Imagine something like Notion where you can also include formulas that refer to other parts of the document, for example to compute aggregates from a table included elsewhere in the document. Representing formulas as ordinary elements in the document would add powerful meta-programming capabilities to such systems.
- Beyond Textual Notations for Programming. Most programming is done using text, but a picture is worth a thousand words! The idea of projects in this area is to explore how to express programs better using visual representations. For example, can we use block-based languages to define more complex things than simple programs (or even to define other block-based languages)? Can a visual programming notation leverage the extra information encoded in the two-dimensional structure of the programming environment?
- Formal Models of Interactive Programming Systems. Programming language theory, type systems and semantics are all built around programming languages: they see programs as formal entities written in some textual language with a grammar. But many interesting aspects of programming happen in stateful interactive programming systems (think modern IDEs with REPLs and debuggers, notebook systems for data science or older systems like Smalltalk). Formally modelling such interactive systems is an interesting and somewhat more theoretical problem that can be a good fit for a more academically inclined student.
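To make the document-based programming idea above a little more concrete, here is a minimal Python sketch in which a document is a list of elements and a formula is just another element that refers to a table by name. All names (`Text`, `Table`, `Formula`, `evaluate`, `render`) and the sample document are assumptions made up for illustration, not an existing system.

```python
from dataclasses import dataclass


@dataclass
class Text:
    content: str


@dataclass
class Table:
    name: str
    columns: list
    rows: list


@dataclass
class Formula:
    # A formula is an ordinary document element that refers to a table by name.
    aggregate: str  # "sum" or "mean"
    table: str
    column: str


AGGREGATES = {"sum": sum, "mean": lambda xs: sum(xs) / len(xs)}


def evaluate(doc, formula):
    """Find the referenced table in the document and aggregate one column."""
    table = next(e for e in doc if isinstance(e, Table) and e.name == formula.table)
    idx = table.columns.index(formula.column)
    return AGGREGATES[formula.aggregate]([row[idx] for row in table.rows])


def render(doc):
    """Produce one line of text per element, evaluating formulas on the way."""
    lines = []
    for e in doc:
        if isinstance(e, Text):
            lines.append(e.content)
        elif isinstance(e, Table):
            lines.append(f"[table {e.name}: {len(e.rows)} rows]")
        else:
            lines.append(str(evaluate(doc, e)))
    return lines


# Hypothetical document, invented for illustration.
doc = [
    Text("Trip budget"),
    Table("costs", ["item", "price"], [["train", 40], ["hotel", 120], ["food", 35]]),
    Text("Total:"),
    Formula("sum", "costs", "price"),
]

# Because formulas are plain data, a program can inspect or rewrite them.
averaged = [Formula("mean", e.table, e.column) if isinstance(e, Formula) else e
            for e in doc]

print(render(doc))
print(render(averaged))
```

The last few lines hint at the meta-programming angle: since a formula is an element like any other, a transformation over the document can replace every sum with a mean without any special machinery.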
References
- [1] Constraint-based tools for building user interfaces
- Ink and Switch: An independent research lab exploring the future of tools for thought
- Technical dimensions of programming systems
- Darklang and more videos about it
- Histogram: You have to know the past to understand the present - online demo
- Subtext: uncovering the simplicity of programming - project homepage
Interactive GUI for Programming-Oriented Proof Assistant
Proof assistants are often used in programming language research to formalize and prove properties of models of programming languages. They are typically text-based and have only limited interactive capabilities.
The goal of the project is to design and implement an interactive graphical user interface for a theorem prover. The GUI should be able to display proofs about programs in a readable way, use the capabilities of the underlying theorem prover to suggest possible steps in proofs (based on the current proof context) and let the user easily correct automatically generated parts of the proof.
References
- SASyLF: An Educational Proof Assistant for Language Theory
- PLT Redex: Domain-specific language for specifying and debugging operational semantics
F# Libraries, Compilers and Language Extensions
F# is a functional-first language targeting .NET, JavaScript and other runtimes. Projects related to F# can take a number of different forms, ranging from interesting libraries for F# to F# language extensions or projects related to the compilation of F#. If you are interested in F#, we can discuss different possible ideas!
- Generating Types from Types via Type Providers. Type providers are an F# mechanism that makes it possible to generate types “behind the scenes” based on external information such as database structure (for data access). Currently, they can take only very limited information as inputs (string constants). If the compiler could pass user-defined types as parameters to a type provider [1], this could be used for interesting meta-programming tools. The aim of the project is to modify the F# compiler and showcase the possible uses of the new meta-programming capabilities.
- Compiling F# to (Whatever You Want). F# has been built for .NET, but thanks to Fable [2], it can target JavaScript and a couple of other ecosystems. There are still a lot of places where one cannot use F# though! The aim of this project is to extend Fable to support various other compilation targets. This could be other language ecosystems (Lua for game scripting), embedded systems (Micro Bit), low-level targets (LLVM, Graal, WebAssembly) or weird things (Excel spreadsheets, smart contracts).
References
- [1] F# RFC FS-1023 - Allow type providers to generate types from types
- [2] Fable - Fable is a compiler that brings F# into the JavaScript ecosystem
Programming with LLM-based Assistants
Large Language Models (LLMs) are becoming a standard part of the programmer’s toolbox, whether we like it or not. A lot of work can be done on integrating the new LLM tools with conventional programming language research, such as using program verification tools to check the correctness of generated programs or using types to improve generated suggestions [1]. We are happy to supervise a range of projects and theses, such as those below, that explore this integration.
- LLM Integration with Stateful Programming Systems. LLMs are good at generating code, but what if you are in a system that is running and has a live state? A typical example is Smalltalk, but the same applies to a debugger or to web browser developer tools. The project would explore how to use LLMs in stateful systems, for example by passing information about runtime state as part of the LLM context or by using an LLM not just to write code, but also to suggest other possible interactions with the system (e.g., edit the value of a variable in a debugger).
- Language and Theory for Composing LLM Prompts. To generate larger amounts of code, people often compose chains of LLM prompts (e.g., generate a plan, then suggest code for each step; or generate code, generate tests, then check that they match). This project would look at systems for such composition [2,3] from a programming language perspective. What would a better language for this task look like? Are there any properties of such (meta-)programs that we can study?
References
- [1] Statically Contextualizing Large Language Models with Typed Holes
- [2] GenAIScript: Scripting for Generative AI
- [3] LangChain Expression Language
History and Philosophy of Programming
Programming has a rich history! Many great ideas have been lost in the past and deserve recovering. Many programming concepts have evolved in unexpected ways and changed over time. Projects that investigate programming from historical and philosophical perspectives can document this history or try to bring back great past ideas.
- Recovering Ideas from Past Programming Systems - There are many past programming systems that have been forgotten (Commodore 64 BASIC, HyperCard, Boxer, LISP machines), but that contain interesting ideas that could be adapted and used in programming today. A project could document such systems using historical sources and reimplement some of the ideas.
- Document How Programming Concepts Evolve - Concepts like types, processes, objects or events have existed in many different programming environments over time (but also in logic or linguistics). The idea of the project would be to document how the meaning of core programming concepts evolves and how new technologies, formalisms or languages shape that meaning.
A project like this would put more emphasis on rigorous work with historical sources and on the quality of writing, and may be more suitable for Master’s students with some relevant background or existing interest in the topic.
References
- The Lost Ways of Programming: Commodore 64 BASIC - online reconstruction
- What we talk about when we talk about monads (PDF)
- For more ideas, see HaPoP 5 abstracts and Computing and Programming in Context
- For related work on history of mathematics, see Proofs and Refutations: The Logic of Mathematical Discovery