The use of Alloy and LLMs

I’ve been heavily engaged with GPT-4 recently. It is fascinating how much it knows, but using it in a real setting (controlling the Eclipse IDE) is a nightmare: it does some things surprisingly well but then makes the most basic mistakes. Thank god there is git to restore the workspace’s previous state.

Interestingly, it is quite good at correcting a mistake after a proper diagnosis or a counterexample. It often feels like an open-loop system where the human is supposed to be the feedback provider. Sadly, this requires extensive knowledge on the part of the ‘Clever Hans’ (as this mechanism is called).

I wonder if we could use Alloy as the ‘Clever Hans’? If the LLM generates an Alloy model, we could generate counterexamples and feed these back.
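To make that loop concrete, here is a minimal sketch (the model and assertion are invented purely for illustration): the LLM proposes a model and a claim, the Analyzer checks the claim, and the counterexample instance goes back into the prompt.

```
-- Hypothetical model an LLM might propose for a tiny file system.
abstract sig Object {}
sig File extends Object {}
sig Dir extends Object { entries: set Object }
one sig Root extends Dir {}

-- The LLM's over-optimistic claim: everything is reachable from Root.
assert AllReachable { Object = Root.*entries }

-- The Analyzer finds a counterexample (e.g. an orphaned File);
-- that instance is the feedback we return to the LLM.
check AllReachable for 4
```

The counterexample is exactly the kind of grounded, machine-checked feedback that a human reviewer would otherwise have to supply by hand.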

GPT-4 can generate Alloy models, but they are usually not very clever or good. Perhaps that could improve with a thorough description of the Alloy language in the prompt, or by extensively training an LLM on Alloy?

Clearly the syntax does not have to be Alloy; the idea is to generate counterexamples so that the LLM is confronted with the consequences of its proposals.

Is anybody else thinking along these lines?

Hi! This is exactly what I’m going to be working on starting in a week – I’ve enrolled at The Recurse Center for a 6-week (March 25 – May 3) professional sabbatical. I have lots of ideas, but generally I’m playing in this space of formal models integrated into day-to-day dev settings, bridged by LLMs. I generally work in Rails applications, and have good vibes with Alloy, so I’m starting with Alloy+LLM+Rails.

Pre-gaming with a bit of research: this is very aligned with the work Sullivan is doing at SCOPE Lab, and I’ve run into other Rails+Alloy projects too, such as Near’s earlier work on Rubicon, a (defunct) Ruby RSpec DSL that integrates Alloy into working test suites.

I’m imagining taking a working codebase (say a Rails app), extracting one or more Alloy models with the LLM, maybe also taking some direction on what we want to explore / prove / disprove, and then generating back unit tests (say RSpec tests) that pass/fail as expected. In Rails this might focus on the models, API interactions, existing test functionality, etc. – so I’m hoping there is enough consistent structure there for the cleverness of Hans to turn into something practical. Maybe something that can get into the CI/CD pipeline. Dunno.
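As a toy illustration of the extraction step (the domain, Rails models, and invariant here are all made up), an LLM might turn a has_many / belongs_to pair into something like the model below, and each counterexample the Analyzer finds could then be rendered back as a failing RSpec case:

```
-- Invented sketch of what an LLM might extract from
--   class User < ApplicationRecord; has_many :posts; end
--   class Post < ApplicationRecord; belongs_to :author, class_name: "User"; end
sig User { posts: set Post }
sig Post { author: one User }

-- Candidate invariant to explore: the two directions of the
-- association agree. Nothing in the model enforces this yet.
assert Consistent { author = ~posts }

-- The Analyzer finds a counterexample (a Post whose author does not
-- list it) -- raw material for a generated failing RSpec test.
check Consistent for 4
```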

One of my co-workers suggested focusing on CanCan access control as an even more focused subset, which makes sense, and there are likely other areas. But I want to swing for the stars.
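To sketch why the access-control subset is appealing (everything below is invented, not real CanCan semantics): ability rules translate naturally into a relation, and policy claims become assertions the Analyzer can try to break.

```
-- Invented sketch of a CanCan-style ability model.
abstract sig Role {}
one sig Guest, Member, Admin extends Role {}
sig User { role: one Role }
sig Article { owner: one User }

-- can?(:edit, article), mirrored as a derived relation:
-- admins can edit anything, owners can edit their own articles.
fun canEdit: User -> Article {
  { u: User, a: Article | u.role = Admin or a.owner = u }
}

-- Policy claim worth trying to break: guests can never edit.
assert GuestsReadOnly {
  all u: User, a: Article | u.role = Guest implies u->a not in canEdit
}
-- Counterexample: a Guest who owns an article can still edit it,
-- because the ownership rule ignores roles.
check GuestsReadOnly for 4
```

That kind of loophole is exactly what I’d want surfaced as a failing test.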

I welcome ideas!!! Since I’ll be working on this more or less full time, I’m also up for direct collaboration or brainstorming on Zoom and whatnot. I’ll be in Brooklyn for the duration and would be happy to buy someone an IRL coffee if they wanna talk Alloy+LLM+Rails or some overlap therein 🙂

This sounds incredibly interesting!

You’re moving in a direction that, I think, I also tried to pursue in Java. Some of my thoughts are in this video: https://www.youtube.com/watch?v=_j3SdIGqYG8

This post was, however, more in the direction of letting the LLM make a proposal/plan in Alloy, then creating an instance and feeding it back.
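In that flavor of the loop you ask the Analyzer for an instance rather than a counterexample; a minimal invented sketch:

```
-- Invented sketch: the LLM's "plan" is a predicate, and `run`
-- asks the Analyzer for a concrete instance to feed back.
sig Task { deps: set Task }
pred AcyclicPlan { no t: Task | t in t.^deps }
run AcyclicPlan for 4
```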

Anyway, keep us posted, or keep me posted (peter.kriens@aqute.biz); I’m very interested in these efforts.

What a delightful video @peter.kriens! Yes, this is definitely the same area that I’m exploring here. I’m working out which interactions can be genuinely practical, and at which layer (CLI, embedded in the test suite, CI/CD).

Once I’ve got a few ideas I’d love to chat live in the next few weeks – I’ll email to coordinate, but I’ll also post a public Zoom link here that anyone interested is welcome to join, as an ad-hoc demo and discussion group.

Anyone else working in this space?
