After trial and error, I’ve ended up with the following model of git object:
sig Object {}
sig Tree {
children: some (Object + Tree)
}
sig Message {}
sig Commit {
tree: one Tree,
parent: lone Commit,
message: lone Message
}
fact {
-- Trees
-- … can't contain themselves
no t: Tree | t in t.^children
-- … can't have the same children as another tree. If they do in real life,
-- it's just the same tree (the storage is content-addressable)
no disj t, u: Tree | t.children = u.children
-------------------------------------
-- Commits
-- … can't form a cycle
no c: Commit | c in c.^parent
-- … can't be exactly the same as another commit. Same reason trees can't
-- be identical.
no disj c, d: Commit | c.tree = d.tree && c.parent = d.parent && c.message = d.message
-------------------------------------
-- Messages
-- … are used by at least one commit. (Messages are really just a way to
-- model that you can have a commit with or without a message.)
all m: Message | some c: Commit | c.message = m
}
pred Default {}
run Default for 3
It produces helpful instances! But now that I’ve got this basically working, I want to stop and ask: what would an expert improve? I’m sure I’m going about things in less-than-great ways (e.g. maybe Object
and Tree
and Commit
should all inherit from something so as to model that they’re all stored together in .git/objects
on a filesystem?) and I’d love to hear y’all’s feedback!
(Next step for me: try using time in Electrum to model some simple operations and check that my model and the real git
binary give the same result!)