New Parser using ANTLR

The current code base of Alloy started in the '90’s and has been worked on by a lot of very smart graduate students. Some of the code is absolutely brilliant but the age of the code base is clearly showing. Since Alloy started 30 years ago, a lot has changed. Java went from a bit more than a toy language to a half way decent one (really!). Today it has many concepts out of the box that had to be developed from scratch in Alloy. The code also shows that many generations of students worked at it for a short period; it is often surprisingly hard to find where something is happening. Worst of all, everything seems to be connected to anything.

Although I originally wanted to retire, I don’t use a lot of Alloy modeling anymore, Matthew Di Ferrante was willing to sponsor my work to do further improvements. One the first things we decided to do is to work out the actual grammar and create a new ANTLR parser based on modern technology. (The grammer can also be used in Python, JS, etc.)

For this sponsoring, i will take advantage of the work I did a long time ago: develop an (interface based) API on Alloy. A well defined API will make it easier to understand the Alloy code for educational purposes and simplifies building extensions. Most of the extensions over the last years have been retrofitted in with enormous effort. The Electrum port into Alloy 6 was an underappreciated herculean effort.

API based programming is something I worked on for most my life. It was the reason I fell in love with OSGi and it was the reason I fell in love with Alloy. For the OSGi I wrote thousands of pages of specifications and in retrospect I wish I had had something like Alloy. I am a sucker for concise clear crisp specifications and Alloy could make them verifiable. It was a bit of a surprise to me how the formal design community seems to highly focused on the proof aspect. For me a good Alloy model is worth a thousand pictures.

The reason that I am diving a bit deeper in the API topic is that I feel that Alloy itself could become an example of software engineering with formal methods. My good intention for this year is to continue the work with an Alloy model of Alloy. Anybody that is interested in this angle of Alloy should speak out so we can collaborate on it. Maybe universities that do software engineering and formal methods?

So how will this work go? The strategy of the current work is to not change the existing code base but insert an API between the parts. A Parser API, an AST API, a Solver API, a Visualizer API, a runtime API etc. For each API, a classic provider will created that maps the API to the classic code base. The classic code base might implement some interfaces but remains unchanged and compatible with today’s. However, once an API is in place it will become easier to plugin new implementations. However, all along the way we should remain 100% backward compatible although newer features might not work for older parts.

However, I think our direction should become language server oriented with VS Code as primary front end. This implies that likely little work will be done on the Java UI.

I plan to use this forum as a blog of my progress. I am already itching to tell you about all the nooks and crannies I found in Alloy’s sometimes magical syntax!

Peter Kriens

5 Likes

Thanks for all your efforts. I note that there’s a VS Code initiative for TLA+ too (Eclipse for the new generation…)

1 Like

Hey, @peter.kriens, I don’t remeber where I found this, and you’ve probably seen this, but just in case:

If I understand correctly, it’s a reimplementation of a subset of the Alloy language in Racket.

I think there is still a lot of value in the Java eco system. This is partly a bias from my side due to experience but it also has the advantage we can make one deliverable that runs anywhere.

Last but not least, I do not believe in big bangs, they tend to never end. It is better to do incremental changes that make the current Alloy better and better than betting the farm on some new environment. New environments often make a specific part easier but they require a lot of work on the other parts.

Although I can get upset by many of its implementation details, it is a solid piece of work that is remarkably rich in features.

1 Like

Peter - This is a terrific idea! One that would be very beneficial to efforts to integrate new solvers and extensions with the Alloy code base.

In case it is helpful, here is an ANTLR grammar that a former MMath student of mine wrote to analysis Alloy 5 models: static-profiling-of-alloy-models/src/main/antlr/com/alloyprofiling/ALLOY.g4 at main · WatForm/static-profiling-of-alloy-models · GitHub

Three challenges that we have had when integrating with Alloy are

  1. the contextual meaning of some syntax operators such as “some”, e.g. “some” is internally recognized as “SOME” or “SOMEOF” depending on its context;

  2. there are parts of the internal Alloy AST that are not visible syntactically - we recently ran into an operator called “EXACTLY…” (sorry I can’t remember the “exact” name of it right now :slight_smile: ).

  3. Not enough of Alloy’s AST classes are public and modifiable.

Thanks!

I do have a grammar by now that seems to handle the language. ANTLR 4 is surprisingly powerful.

The hard part is the overloading of the dot operator and the box operator: v.qname[p] has many different interpretations based on what qname refers to after parsing all the modules.If qname refers to a function or predicate, the v can be the this/first parameter of the function/predicate (or not) and the box operator ('[]') can hold the parameters or not. So the expansion v.qname[p] can be:

  • relation(join p (join v qname))
  • fun/0(join p (join v (call qname)))
  • fun/1(join p (call qname v)) or (join v (call qname p))
  • fun/2(call qname v p)

This means that some important operator priority decisions can only be taken when the module is completely parsed. This was a bit challenging. In the current Alloy this delegated to the ChoiceExpr and it hugely complicates the type system and the AST. This design choice forced the Type to be able to have different cardinalities. I think I found a more elegant solution but still need to prove that it is identical.

The new AST will not be modifiable. I am going through great lengths to make it immutable. If you want to build a new AST you need to construct a completely new one with the visitor or listener pattern but the advantage is that you can freely share branches because they’re immutable.

So far I’ve separated the build process of the AST and the final immutable AST. The final AST nodes are completely public Java records and leverage the new Java pattern matching techniques with sealed types. Sealed types make it possible to switch on Java types. Every AST node is a record so there are no records for specific operation, each operation is a record type. The AST node/records have no operations.

There is a ModuleBuilder class that can be used to build Alloy modules from code. It contains methods that are called from the parser and it allows you to build a complete module.

Most important, there is a strict separation between the API (mostly Java interfaces) and the implementations.

My unfinished & messy working branch is here: org.alloytools.alloy/org.alloytools.alloy.parser/src/main/antlr/Alloy.g4 at pkriens/api · pkriens/org.alloytools.alloy · GitHub