Alloy 6 vs. TLA+

I know Alloy 6 is fairly new, but I was wondering if there are anecdotes from people who have used both Alloy 6 and TLA+? Given that they are both high-level formal specification languages which support temporal logic, I assume this is not an unreasonable comparison.

Specifically, I’m curious about the learning experience and the ease of use when modelling various domains. I did find that @grayswandyr posted on the topic. I also know Hillel Wayne has been trying to write a comparison, but he currently has other priorities.


I’m still quite new to Alloy, but I’ve spent the last 2-3 months basically deep-diving into TLA+.
TLA+ really shines when it comes to state machines, but it’s very unstructured.

Meanwhile, I feel that Alloy does much better describing structures than state machines. If I’m going to architect a new database, I reach for Alloy.

If I’m working with complex state and interacting systems (embedded development or a complex GUI flow), I reach for TLA+.


I’ve never heard of “complex GUI flow” as a use-case for TLA+. Do you have an example of this posted somewhere?

Sadly no. It’s a monolith I’m working with. Suffice to say, there was some business logic embedded in the GUI that should’ve been isolated from the presentation layer.

Although I mainly use Alloy, I teach both Alloy and TLA+.

I think TLA+ is particularly great for specifying and verifying concurrent and distributed algorithms that, as Alex mentions, can be described by state machines (or directly with PlusCal). I believe TLA+ was designed with that goal in mind, so that is not surprising. In general it is also faster than Alloy at verifying those algorithms. However, even in that particular application it has a couple of drawbacks when compared to Alloy.

First, I think Alloy is superior in the validation task, in particular thanks to the graphical visualisation of instances, the run commands, and the instance iteration operations (for example, they allow us to explore behaviours interactively).

Second, I believe Alloy is superior when you want to specify and verify multiple (complex) configurations of a protocol, for example verifying a distributed spanning tree protocol for all arbitrary connected networks of a given size. TLA+/TLC is more tailored to verifying one configuration at a time (a particular network configuration, a parameter of the protocol specified with CONSTANTS). You can adapt your TLA+ specifications to verify all configurations at once, but doing so is a bit cumbersome and tends to make the verification very slow. This is not an insignificant issue, since a protocol could be buggy for a particular configuration, and it is very unlikely that the designer would know that configuration upfront.
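To make the “all configurations at once” point concrete, here is a minimal Alloy sketch (a toy of my own, not the actual spanning tree spec): a fact pins down every undirected, connected topology within the bound, and a single check command then covers all of them, rather than one hand-picked network.

```alloy
sig Node { adj: set Node }

fact Topology {
  adj = ~adj                    -- links are undirected
  no iden & adj                 -- no self-loops
  all n: Node | Node in n.*adj  -- every node reaches every node: connected
}

assert EveryPairLinked {
  all disj m, n: Node | n in m.^adj
}

-- One command explores EVERY connected topology with up to 5 nodes.
check EveryPairLinked for 5
```

In TLA+/TLC you would instead typically fix one topology via CONSTANTS and re-run per configuration.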

But I think that Alloy is in general more flexible than TLA+: as Alex also mentions, it allows the specification and verification of structural requirements; it allows many different styles of specifying behaviour, not just the “state machine idiom” (although that is the most used one); and it is great for the earlier phases of design (before we get into verification), since it is very good at validation and at handling partial specifications.
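As a tiny illustration of a non-state-machine style (toy names of my own), Alloy 6 lets you state temporal constraints directly, with no next-state relation anywhere:

```alloy
var sig Pending {}   -- requests currently pending

fact EveryRequestServed {
  -- purely declarative: whatever is pending now is eventually served
  always (all p: Pending | eventually p !in Pending)
}

run { some Pending } for 3
```

The solver is free to find any trace satisfying the constraint; you never write an explicit transition predicate.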

All the best,


Only a month late! Here’s my usual spiel:

Alloy comes from the Z line of “systems are basically giant relational algebra problems”, while TLA+ comes from the LTL line of “systems are basically giant temporal logic problems”, which leads to a lot of fundamental differences between them. They also had different design philosophies: Jackson wanted a language that was easy to build tooling for, Lamport wanted a language that was easy to express complex logic in.

One consequence of this is that Alloy is a much smaller language: you can fit the entire syntax on a single sheet. It’s also much easier in TLA+ to write a spec that you can’t model-check, while with Alloy you have to be actively trying. It’s also impossible to write an unbounded model in Alloy, so you’re guaranteed that every spec has a finite state space, which gets rid of the common TLA+ runtime question of “is this taking a long time because the state space is big, or because it’s infinite?”

Re tooling: Alloy converts to SAT, which makes model checking much, much faster. Specs that would take minutes to check in TLA+ take less than a second in Alloy. Being able to model-find as well as model-check is really nice, as is the ability to generate multiple examples and counterexamples. The visualizer is really useful, too, especially for showing counterexamples to nontech folk. But the thing I honestly miss most when working in TLA+ is the REPL. It’s not the best REPL, but it’s a hell of a lot better than no REPL.

Okay, so drawbacks: Alloy 6 added a notion of “time”. This expands the space of problems Alloy can handle, but the syntax has footguns in it. If you forget to update a variable in TLA+, you get an error; if you forget to update one in Alloy, the solver will just assign whatever value it wants to that variable. I asked one of the developers, and apparently it’s not something they can easily address in the current semantics. Also, temporal specs take a lot longer to check than static specs. The more “timelike” the problem gets, the worse Alloy becomes at handling it.
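A minimal sketch of that footgun (toy names of my own):

```alloy
sig Switch {}
var sig On in Switch {}
var sig Broken in Switch {}

pred turnOn[s: Switch] {
  On' = On + s
  -- Footgun: Broken is never mentioned, so Broken' is left
  -- unconstrained and the solver may pick ANY value for it.
  -- The intended frame condition must be written explicitly:
  -- Broken' = Broken
}

run { some s: Switch | turnOn[s] } for 3
```

In TLA+ the analogous omission of a variable from the next-state relation is flagged, not silently filled in.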

Alloy also doesn’t have composable collection types. You can model “a queue of messages”, but you can’t easily model “a queue of messages for each server”. It’s possible but you have to be clever, whereas in TLA+ it’s straightforward.

Less fundamental but still a big problem I keep having: no good iteration or recursion. Transitive closure is pretty nice but it’s not powerful enough. For example:

     sig Node { edge: set Node }

Is N1 connected to N2? Easy, just do N1 in N2.^edge. How many nodes long is the shortest path? That’s much harder.
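The easy half is a one-liner; for the hard half, my understanding is that you typically have to materialise distances as an explicit relation (say, a hypothetical dist: Node -> Node -> Int) and constrain it level by level, which is exactly the kind of cleverness being complained about here:

```alloy
sig Node { edge: set Node }

-- Reachability: one transitive-closure expression.
pred connected[m, n: Node] { n in m.^edge }

-- No comparably direct expression exists for shortest-path length.
run { some disj m, n: Node | connected[m, n] } for 4
```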

Which one is more appropriate really depends on your problem. As a very, very, VERY rough rule of thumb: TLA+ is best when you’re most interested in the dynamics of a system: the algorithms, the concurrency, etc. Alloy is best when you’re most interested in the statics of a system: the requirements, the data relationships, the ontology, etc. My standard demos for showcasing TLA+ are concurrent transfers and trading platforms, while my standard demos for Alloy are access policies and itemized billing. You can use each in the other’s niche, too (see Zave’s work on Chord), so it’s not a strict division.


I’ll make getting the actual comparison post done a priority now.


The (symbolic) nuXmv backend of Alloy 6 is surprisingly good at checking invariants (with unbounded model checking). For example, recently I was specifying the original Echo distributed spanning tree protocol from Chang, and Alloy 6 was way faster at checking the key correctness property for ALL connected networks with up to 5 nodes than TLA+/TLC was at checking ONE single configuration with a complete network with 5 nodes (even with symmetry on). I agree that in general TLA+/TLC is faster for distributed and concurrent algorithms, but that is not always the case (I think that when we have a lot of non-determinism the symbolic model checking of Alloy 6 tends to be better, but that is just a wild guess).

With the seq “multiplicity”, that particular example is also fairly easy to model in Alloy 6 (though the syntax for the operations is not pretty, and neither is the efficiency). But I agree that the larger collection of types in TLA+ makes it easier for most users to specify the data structures in their systems, since they more closely resemble the kinds of types that are available in programming languages. The “everything is a relation” motto of Alloy forces you into a very abstract modelling mindset, which might be very puzzling at first.
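For what it’s worth, here is roughly what the seq version of “a queue of messages for each server” looks like (a sketch with toy names of my own); the per-server frame conditions are part of what makes the syntax less pretty:

```alloy
sig Message {}
sig Server { var queue: seq Message }

pred send[s: Server, m: Message] {
  s.queue' = s.queue.add[m]               -- append m at the tail
  all t: Server - s | t.queue' = t.queue  -- explicit frame condition
}

pred deliver[s: Server] {
  some s.queue
  s.queue' = s.queue.rest                 -- drop the head
  all t: Server - s | t.queue' = t.queue
}

run { some s: Server, m: Message | send[s, m] } for 3 but 4 seq
```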



Oh that’s interesting! Would you be willing to share both specs?

Sure, I’ll send you the specs by email.

Oww awesome, could I have them as well? I’m the maintainer of a Clojure library that wraps TLC (Recife), but I would like to have Alloy 6 available as well (as Alloy is built with Java, like TLC). Maybe having a TLA+ operator calling Alloy, who knows lol

Sure, I’ll email you.


I’m new to formal methods, but I’m on fire for the notion and benefits of “thinking above the code”. So far I have only played with TLA+ and really used only Alloy, and my impression so far (still very new and ignorant) is that TLA+ is like the “expert mode”: I could probably write a spec for just about anything with it, but it has a higher learning curve. Also, there seem to be many more learning resources available for TLA+ (probably just due to maturity, I’m guessing). PlusCal looks like an amazingly helpful tool to ease a noob like me into this.

Alloy’s approachable syntax and the model explorer seem absolutely awesome! Some parts of the Alloy docs I’ve been rereading because I’m so new to this higher level of abstraction. The temporal logic seems to make sense to me pretty quickly, it’s the relational logic that confused me a little bit at first. I really like Alloy a lot. (Please forgive me if I’m misusing some of this new vocabulary)

I’m curious, does anyone here have any experience with P? The whole idea of representing your system as communicating state machines seems interesting.

In any case, I just took a new job working for a utility provider. I work with a bunch of Electrical Engineers and Power Systems Engineers, so luckily they are pretty supportive when it comes to improving quality, verifiability, and modeling. I’m extremely confident that Alloy 6 and/or TLA+ would be extremely helpful to our team. I’m very new to power systems as well. We’re trying to make the power grid “smarter”: the various devices involved in a microgrid send data about the system/circuit state through a message broker, our Rust code subscribes to the topics and uses a genetic algorithm to “solve” the circuit for VVO/CVR (Variable Voltage Optimization/Conservation Voltage Reduction), and then it issues commands to the hardware to adjust the voltage so we stay within the allotted range and don’t fry people’s stuff.

Given that this is a set of complex systems with a lot of concurrent behavior, extremely expensive equipment, and operators could get hurt or people could lose power, I’ve been shocked at the lack of documentation and tests of the code base my little team is inheriting.
This is an excellent example of verifiable correctness being kinda important, so I was pretty surprised when I asked around and found that the other software folks were not only unaware of formal methods but confused and almost hostile to the idea (perhaps I did not communicate effectively).

It has taken so long for my team to gain background context on the problem domain and to become familiar with the massive and complex code base we’ve inherited.

It never ceases to amaze me how ubiquitous corner-cutting is in the industry (few or low-quality tests, little or no diversity of testing strategies, poor or non-existent documentation, etc.). Many businesses just want to get to market first; they say they value quality, but any suggested practices or processes to improve quality or verifiability are met with impatience or annoyance.

Side Note: the previous team of contractors (and original authors of the code base) are being let go because they blew up a battery/capacitor bank that probably cost a couple years of my salary.

I’ve been hungry to learn some new tactics and tools to help the software teams I work on level up the quality of our deliverables and I think Alloy 6 and/or TLA+ is exactly what I’ve been looking for. I’ve got some minimal buy-in and I’m going to create a specification for our test grid, the devices on it, and ideally the voltage optimization algorithm and get the SMEs to help me ensure it’s correct and demonstrate to the software teams the power and benefits of this practice!

I think I fully agree with your assessment of our industry and its practices. I’m on a lifelong quest to find ways to write better software better, and I’m puzzled by how such an important industry can sometimes be so shoddy.

That said, it is hard. I fell in love with Alloy about 5 years ago, and I really think it has the potential to improve the design process. However, I do not think it is ready for the kind of industrial applications you talk about. It might get there, and I’ve been looking for customers to fund the development of the missing pieces. For I do think there is an impedance mismatch between the formal tools and the industry tools.

I think the industry mostly needs a strong specification language that tightly integrates with the continuous integration pipeline. Academia seems mostly interested in how clever the tool is in verifying assertions.

The most brilliant part of Alloy is that it can create instances of models, but because of that, numbers are severely limited and model size is sometimes a problem.

I do believe that the impedance mismatch can be bridged but it does require a lot of work and I see nobody really working on this bridge.