Alloy 6 vs. TLA+

I know Alloy 6 is fairly new, but I was wondering if there are anecdotes from people who have used both Alloy 6 and TLA+? Given that they are both high-level formal specification languages which support temporal logic, I assume this is not an unreasonable comparison.

Specifically, I’m curious about the learning experience and the ease of use when modelling various domains. I did find that @grayswandyr posted on the topic. I also know Hillel Wayne has been trying to write a comparison, but he currently has other priorities.


I’m still quite new to Alloy, but I’ve spent the last 2-3 months basically deep-diving into TLA+.
TLA+ really shines when it comes to state machines, but it’s very unstructured.

Meanwhile, I feel that Alloy does much better describing structures than state machines. If I’m going to architect a new database, I reach for Alloy.

If I’m working with complex state and interacting systems (embedded development or a complex GUI flow), I reach for TLA+.


I’ve never heard of “complex GUI flow” as a use-case for TLA+. Do you have an example of this posted somewhere?

Sadly no. It’s a monolith I’m working with. Suffice to say, there was some business logic embedded in the GUI that should’ve been isolated from the presentation layer.

Although I mainly use Alloy, I teach both Alloy and TLA+.

I think TLA+ is particularly great for specifying and verifying concurrent and distributed algorithms that, as Alex mentions, can be described by state machines (or directly with PlusCal). I believe TLA+ was designed with that goal in mind, so that is not surprising. In general it is also faster than Alloy at verifying those algorithms. However, even in that particular application it has a couple of drawbacks when compared to Alloy.

First, I think Alloy is superior in the validation task, in particular thanks to the graphical visualisation of instances, the run commands, and the instance iteration operations (for example, they allow us to explore behaviours interactively).

Second, I believe Alloy is superior when you want to specify and verify multiple (complex) configurations of a protocol, for example verifying a distributed spanning tree protocol for all arbitrary connected networks of a given size. TLA+/TLC is more tailored to verifying one configuration at a time (a particular network configuration, a parameter of the protocol specified with CONSTANTS). You can adapt your TLA+ specifications to verify all configurations at once, but doing so is a bit cumbersome and tends to make the verification very slow. This is not an insignificant issue, since a protocol could be buggy for a particular configuration, and it is very unlikely that the designer would know that configuration upfront.
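To make the “all configurations at once” point concrete, here is a minimal Alloy sketch (a toy of my own, not the actual spanning tree spec): a fact pins down every undirected, connected topology within the bound, and a single check command then covers all of them, rather than one hand-picked network.

```alloy
sig Node { adj: set Node }

fact Topology {
  adj = ~adj                    -- links are undirected
  no iden & adj                 -- no self-loops
  all n: Node | Node in n.*adj  -- every node reaches every node: connected
}

assert EveryPairLinked {
  all disj m, n: Node | n in m.^adj
}

-- One command explores EVERY connected topology with up to 5 nodes.
check EveryPairLinked for 5
```

In TLA+/TLC you would instead typically fix one topology via CONSTANTS and re-run per configuration.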

But I think that Alloy is in general more flexible than TLA+: as Alex also mentions, it allows the specification and verification of structural requirements; it allows many different styles of specifying behaviour, not just the “state machine idiom” (although that is the most used one); and it is great for the earlier phases of design (before we get into verification), since it is very good at validation and at handling partial specifications.
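As a tiny illustration of a non-state-machine style (toy names of my own), Alloy 6 lets you state temporal constraints directly, with no next-state relation anywhere:

```alloy
var sig Pending {}   -- requests currently pending

fact EveryRequestServed {
  -- purely declarative: whatever is pending now is eventually served
  always (all p: Pending | eventually p !in Pending)
}

run { some Pending } for 3
```

The solver is free to find any trace satisfying the constraint; you never write an explicit transition predicate.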

All the best,


Only a month late! Here’s my usual spiel:

Alloy comes from the Z line of “systems are basically giant relational algebra problems”, while TLA+ comes from the LTL line of “systems are basically giant temporal logic problems”, which leads to a lot of fundamental differences between them. They also had different design philosophies: Jackson wanted a language that was easy to build tooling for, Lamport wanted a language that was easy to express complex logic in.

One consequence of this is that Alloy is a much smaller language: you can fit the entire syntax on a single sheet. It’s also much easier in TLA+ to write a spec that you can’t model-check, while with Alloy you have to be actively trying. It’s also impossible to write an unbounded model in Alloy, so you’re guaranteed that every spec has a finite state space, which gets rid of the common TLA+ runtime question of “is this taking a long time because the state space is big, or because it’s infinite?”

Re tooling: Alloy converts to SAT, which makes model checking much, much faster. Specs that would take minutes to check in TLA+ take less than a second in Alloy. Being able to model-find as well as model-check is really nice, as is the ability to generate multiple examples and counterexamples. The visualizer is really useful, too, especially for showing counterexamples to nontech folk. But the thing I honestly miss most when working in TLA+ is the REPL. It’s not the best REPL, but it’s a hell of a lot better than no REPL.

Okay, so drawbacks: Alloy 6 added a notion of “time”. This expands the space of problems Alloy can handle, but the syntax has footguns in it. If you forget to update a variable in TLA+, you get an error; if you forget to update one in Alloy, the solver will just assign whatever value it wants to that variable. I asked one of the developers, and apparently it’s not something they can easily address in the current semantics. Also, temporal specs take a lot longer to check than static specs. The more “timelike” the problem gets, the worse Alloy becomes at handling it.
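A minimal sketch of that footgun (toy names of my own):

```alloy
sig Switch {}
var sig On in Switch {}
var sig Broken in Switch {}

pred turnOn[s: Switch] {
  On' = On + s
  -- Footgun: Broken is never mentioned, so Broken' is left
  -- unconstrained and the solver may pick ANY value for it.
  -- The intended frame condition must be written explicitly:
  -- Broken' = Broken
}

run { some s: Switch | turnOn[s] } for 3
```

In TLA+ the analogous omission of a variable from the next-state relation is flagged, not silently filled in.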

Alloy also doesn’t have composable collection types. You can model “a queue of messages”, but you can’t easily model “a queue of messages for each server”. It’s possible but you have to be clever, whereas in TLA+ it’s straightforward.

Less fundamental but still a big problem I keep having: no good iteration or recursion. Transitive closure is pretty nice but it’s not powerful enough. For example:

     sig Node { edge: set Node }

Is N1 connected to N2? Easy, just do N1 in N2.^edge. How many nodes long is the shortest path? That’s much harder.
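The easy half is a one-liner; for the hard half, my understanding is that you typically have to materialise distances as an explicit relation (say, a hypothetical dist: Node -> Node -> Int) and constrain it level by level, which is exactly the kind of cleverness being complained about here:

```alloy
sig Node { edge: set Node }

-- Reachability: one transitive-closure expression.
pred connected[m, n: Node] { n in m.^edge }

-- No comparably direct expression exists for shortest-path length.
run { some disj m, n: Node | connected[m, n] } for 4
```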

Which one is more appropriate really depends on your problem. As a very, very, VERY rough rule of thumb: TLA+ is best when you’re most interested in the dynamics of a system: the algorithms, the concurrency, etc. Alloy is best when you’re most interested in the statics of a system: the requirements, the data relationships, the ontology, etc. My standard demos for showcasing TLA+ are concurrent transfers and trading platforms, while my standard demos for Alloy are access policies and itemized billing. You can use each in the other’s niche, too (see Zave’s work on Chord), so it’s not a strict division.


I’ll make getting the actual comparison post done a priority now.


The (symbolic) nuXmv backend of Alloy 6 is surprisingly good at checking invariants (with unbounded model checking). For example, recently I was specifying the original Echo distributed spanning tree protocol from Chang, and Alloy 6 was way faster at checking the key correctness property for ALL connected networks with up to 5 nodes than TLA+/TLC was at checking ONE single configuration with a complete network with 5 nodes (even with symmetry on). I agree that in general TLA+/TLC is faster for distributed and concurrent algorithms, but that is not always the case (I think that when we have a lot of non-determinism the symbolic model checking of Alloy 6 tends to be better, but that is just a wild guess).

With the seq “multiplicity”, that particular example is also fairly easy to model in Alloy 6 (though the syntax for the operations is not pretty, and neither is the efficiency). But I agree that the larger collection of types in TLA+ makes it easier for most users to specify the data structures in their systems, since they more closely resemble the kinds of types that are available in programming languages. The “everything is a relation” motto of Alloy forces you into a very abstract modelling mindset, which might be very puzzling at first.
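For what it’s worth, here is roughly what the seq version of “a queue of messages for each server” looks like (a sketch with toy names of my own); the per-server frame conditions are part of what makes the syntax less pretty:

```alloy
sig Message {}
sig Server { var queue: seq Message }

pred send[s: Server, m: Message] {
  s.queue' = s.queue.add[m]               -- append m at the tail
  all t: Server - s | t.queue' = t.queue  -- explicit frame condition
}

pred deliver[s: Server] {
  some s.queue
  s.queue' = s.queue.rest                 -- drop the head
  all t: Server - s | t.queue' = t.queue
}

run { some s: Server, m: Message | send[s, m] } for 3 but 4 seq
```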



Oh that’s interesting! Would you be willing to share both specs?

Sure, I’ll send you the specs by email.

Oww awesome, could I have them as well? I’m the maintainer of a Clojure library that wraps TLC (Recife), but I would like to have Alloy 6 available as well (as Alloy is built with Java, like TLC). Maybe having a TLA+ operator calling Alloy, who knows lol

Sure, I’ll email you.


I’m new to formal methods, but I’m on fire for the notion and benefits of “thinking above the code”. So far I have only played with TLA+ and really used only Alloy, and my impression so far (still very new and ignorant) is that TLA+ is like the “expert mode”: I could probably write a spec for just about anything with it, but it has a higher learning curve. Also, there seem to be many more learning resources available for TLA+ (probably just due to maturity, I’m guessing). PlusCal looks like an amazingly helpful tool to ease a noob like me into this.

Alloy’s approachable syntax and the model explorer seem absolutely awesome! Some parts of the Alloy docs I’ve been rereading because I’m so new to this higher level of abstraction. The temporal logic seems to make sense to me pretty quickly, it’s the relational logic that confused me a little bit at first. I really like Alloy a lot. (Please forgive me if I’m misusing some of this new vocabulary)

I’m curious, does anyone here have any experience with P? The whole idea of representing your system as communicating state machines seems interesting.

In any case, I just took a new job working for a utility provider. I work with a bunch of Electrical Engineers and Power Systems Engineers, so luckily they are pretty supportive when it comes to improving quality, verifiability, and modeling. I’m extremely confident that Alloy 6 and/or TLA+ would be extremely helpful to our team. I’m very new to power systems as well. We’re trying to make the power grid “smarter”: the various devices involved in a microgrid send data about the system/circuit state through a message broker, our Rust code subscribes to the topics and uses a genetic algorithm to “solve” the circuit for VVO/CVR (Variable Voltage Optimization/Conservation Voltage Reduction), and then it issues commands to the hardware to adjust the voltage so we stay within the allotted range and don’t fry people’s stuff.

Given that this is a set of complex systems with a lot of concurrent behavior, extremely expensive equipment, and operators could get hurt or people could lose power, I’ve been shocked at the lack of documentation and tests of the code base my little team is inheriting.
This is an excellent example of verifiable correctness being kinda important, so I was pretty surprised when I asked around and found that the other software folks were not only unaware of formal methods but confused and almost hostile to the idea (perhaps I did not communicate effectively).

It has taken so long for my team to gain background context on the problem domain and to become familiar with the massive and complex code base we’ve inherited.

It never ceases to amaze me how ubiquitous corner-cutting is in the industry (few or low-quality tests, little or no diversity of testing strategies, poor or non-existent documentation, etc.). Many businesses just want to get to market first; they say they value quality, but any suggested practices or processes to improve quality or verifiability are met with impatience or annoyance.

Side Note: the previous team of contractors (and original authors of the code base) are being let go because they blew up a battery/capacitor bank that probably cost a couple years of my salary.

I’ve been hungry to learn some new tactics and tools to help the software teams I work on level up the quality of our deliverables and I think Alloy 6 and/or TLA+ is exactly what I’ve been looking for. I’ve got some minimal buy-in and I’m going to create a specification for our test grid, the devices on it, and ideally the voltage optimization algorithm and get the SMEs to help me ensure it’s correct and demonstrate to the software teams the power and benefits of this practice!

I think I fully agree with your assessment of our industry and its practices. I’m on a lifelong quest to find ways to write better software better, and I’m puzzled by how such an important industry can sometimes be so shoddy.

That said, it is hard. I fell in love with Alloy about 5 years ago, and I really think it has the potential to improve the design process. However, I do not think it is ready for the kind of industrial applications you talk about. It might get there, and I’ve been looking for customers to fund the development of the missing pieces. For I do think there is an impedance mismatch between the formal tools and the industry tools.

I think the industry mostly needs a strong specification language that tightly integrates with the continuous integration pipeline. Academia seems mostly interested in how clever the tool is in verifying assertions.

The most brilliant part of Alloy is that it can create instances of models, but because of that, numbers are severely limited and model size is sometimes a problem.

I do believe that the impedance mismatch can be bridged but it does require a lot of work and I see nobody really working on this bridge.