How would you make this git objects model more idiomatic?

brianhicks · August 24, 2020, 7:24pm

After trial and error, I’ve ended up with the following model of git objects:

sig Object {}

sig Tree {
  children: some (Object + Tree)
}

sig Message {}

sig Commit {
  tree: one Tree,
  parent: lone Commit,
  message: lone Message
}

fact {
  -- Trees
  -- … can't contain themselves
  no t: Tree | t in t.^children
  -- … can't have the same children as another tree. If they do in real life,
  -- it's just the same tree (the storage is content-addressable)
  no disj t, u: Tree | t.children = u.children

  -------------------------------------

  -- Commits
  -- … can't form a cycle
  no c: Commit | c in c.^parent
  -- … can't be exactly the same as another commit. Same reason trees can't
  -- be identical.
  no disj c, d: Commit | c.tree = d.tree && c.parent = d.parent && c.message = d.message

  -------------------------------------

  -- Messages
  -- … are used by at least one commit. (Messages are really just a way to
  -- model that you can have a commit with or without a message.)
  all m: Message | some c: Commit | c.message = m
}

pred Default {}

run Default for 3

It produces helpful instances! But now that I’ve got this basically working, I want to stop and ask: what would an expert improve? I’m sure I’m going about things in less-than-great ways (e.g. maybe Object and Tree and Commit should all inherit from something so as to model that they’re all stored together in .git/objects on a filesystem?) and I’d love to hear y’all’s feedback!

(Next step for me: try using time in Electrum to model some simple operations and check that my model and the real git binary give the same result!)
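
The inheritance idea floated above (having everything stored in .git/objects share a common parent) could be sketched roughly as follows. This is only an illustration, not a recommendation from the thread: `GitObject` and `Blob` are hypothetical names introduced here (`Blob` stands in for the original `Object`), and the `Message` field and the facts are left out to keep the sketch short.

```alloy
-- Hypothetical common parent for everything stored in .git/objects.
-- "abstract" means every GitObject is exactly one of the subtypes below.
abstract sig GitObject {}

-- Stand-in for the original "Object" sig (file contents).
sig Blob extends GitObject {}

sig Tree extends GitObject {
  -- Same multiplicity as the original: every tree has at least one child.
  children: some (Blob + Tree)
}

sig Commit extends GitObject {
  tree: one Tree,
  parent: lone Commit
}

-- Empty run, just to look at instances.
run {} for 3
```

With a shared parent, constraints that apply to every stored object (for example, a later hashing relation) could be stated once over `GitObject` rather than repeated per sig.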

DanielJackson · August 24, 2020, 11:43pm

Hi Brian, this seems like a really nice start. I think this is more than idiomatic enough to move on and start modeling more. I would ask yourself: what design aspects of Git do you have questions or uncertainties about? Or which design aspects are essential to Git working? For example, maybe you want to model how blobs are hashed and what properties the hashes should have.

One small comment about the model. This constraint

all m: Message | some c: Commit | c.message = m

seems to me to belong not to the model proper but to a run command. There’s no reason to say that there can’t be messages that haven’t been assigned to commits. To make such messages disappear in the visualizer, you’d just select “hide unconnected nodes.”
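
One way to read that suggestion is to take the constraint out of the fact and put it in a predicate that only the run command asks for. The sketch below assumes the sigs from the original post; the predicate name `EveryMessageUsed` is hypothetical, and the other facts are omitted for brevity.

```alloy
sig Object {}

sig Tree {
  children: some (Object + Tree)
}

sig Message {}

sig Commit {
  tree: one Tree,
  parent: lone Commit,
  message: lone Message
}

-- Formerly part of the fact. As a predicate it only constrains the runs
-- that name it, so the model itself still allows unused messages.
pred EveryMessageUsed {
  all m: Message | some c: Commit | c.message = m
  -- An equivalent, more relational spelling: Message in Commit.message
}

run EveryMessageUsed for 3
```

Other commands, such as a plain `run {} for 3`, would then be free to produce instances with unassigned messages.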

brianhicks · August 25, 2020, 9:04pm

Ah, thank you! I didn’t know about that option. I’ll definitely have to use that in the future.
