Research Notes

Table of Contents

Type declarations

When translating a type declaration into separation logic, we have to create, besides the type definition, a representation predicate that receives a program variable and lifts it into a logical representation. This logical representation is defined using the model field in the type specification.

Unnamed Model Fields

One of the changes we made to the Gospel typechecker was allowing model fields to be unnamed. For example, if we wish to specify a queue in current gospel whose logical model is a sequence, we must write as such.

type t
(*@ model view : int sequence *)

val push : t -> int -> unit
(*@ push q x
    ensures q.view = old q.view ++ [x] *)

Although the model of the queue is the only thing that interests us in this specification (and indeed, most specifications) we still have to access it via the field view. This can not only be cumbersome but also make specifications more verbose than they need be. Although the example above is still valid in Gospel, we have developed an alternate syntax for cases where types only have a single model field.

type t
(*@ model : int sequence *)

val push : t -> int -> unit
(*@ push q x
    ensures q = old q ++ [x] *)

In the second code snippet, q in the specification has the same type as its model field. This way, the specification becomes a bit easier to write and read.

Multiple Model Fields

When creating the representation predicate for a type with named model fields, we have the option between creating a predicate with multiple fields for the logical values or creating a record type with each field described in the specification. I believe that the sanest way to do this is by using ghost record type. This is because multiple fields can make the translation more illegible, as well as more difficult to execute, seeing as I would have to replace all occurrences of model field accesses as well as manage potential problems with clashing names.

Example (assuming x is of type t with two model fields y and z) :

(*@ f x y z
    ensures x.y + y = z *)

would become, during translation

{}(f(x, y, z)){P(x, y, z) * y + y = z}

While typing, I assume just one type and, in the specification, handle the model fields as ghost fields (which is already the default behaviour). While translating to separation logic, I create two types, one for the program type and another for the model.

Naming problems

The tricky question to solve however is what names to give these types? I think the easiest solution is to give the logical type the same name as the OCaml type and give the program type the same name, but with an underscore at the beginning. This way, we don't have to change anything in the typed specification.

Ephemeral and Pure Annotations

It is also possible to not give the type a model field and instead opt to give an ephemeral annotation as such

type t
(*@ ephemeral *)

This way, we can mark the type as mutable without giving a precise definition of its logical representation. If the type has no annotations, then we assume it to be pure and its model is a reflection of the OCaml type.

Representation Predicates

As stated previously, when we translate a type declaration, we also need to generate an appropriate representation predicate. How we do so depends on how the type is defined.

Abstract Types

Since we cannot give a representation predicate to a type that is undefined in the program, we simply create its signature. This predicate will have two arguments, the program value and the logical representation, that is, the type defined in its model clause.

  • If the type is pure, then the program value is the same as the OCaml type. In the case where the model is a reflection of the OCaml type as described in Section Ephemeral and Pure Annotations, no representation predicate is generated
  • If the type is mutable, then the program value is of type loc.

Concrete Types

TBD

Note on naming

Generally, the generated representation predicate has the same name as the original type, but with the first letter capitalized and an underscore at the start. However, when the type's name is t, the representation predicate gets the same name as the name of the module, so as to make the generated Separation Logic triples simpler to read.

Translation of Type declarations

type t
(*@ model : int *)

Translates into

Type t
Predicate _T (target : t) (model : int)
type t
(*@ mutable model : int *)

Translates into

Predicate _T (target : loc) (model : int)
type t
(*@ model x : int
    model y : int *)

Translates into

Type t = {x : int; y : int}
Type _t

Predicate _T (target : t) (model : _t)
type t
(*@ mutable model x : int
    model y : int
*)

Translates into

Type t = {x : int; y : int}
Predicate _T (target : loc) (model : t)

Implementation

  1. Check if the type has a specification. If so, retrieve the model, if not, assume the model is a reflection of the OCaml type.
  2. We must check if the type has multiple model fields. If so, we create an optional value that has the definition of a record type with all these fields. This type is always marked is non-mutable. The name of this type is the same as the name of the original type.
  3. We create a definition of the type. The name of this type depends if we created a record in the previous step. If so, the name is the old name with an underscore at the start. Otherwise, the name is unchanged. For now, we only support abstract types.
  4. We create the definition for the representation predicate. If the type is mutable, the first argument is of type loc, otherwise, the first argument is the program type. The next argument is the type's model, as was defined previously.

Limitations

  • When translated into CFML, record fields with names that have already been defined don't work (Low priority).
  • No support for polymorphic types with recursive ownership.
  • No support for concrete type definitions. This can be quite tricky since in these cases we would need to define an explicit representation predicate in Gospel, which is not yet possible.
    • Additionally, we must consider the relationship between invariants and representation predicates.

Value descriptions

When translating a value description and its specification, we must take proper care to ensure that all the ownership conditions are properly generated and are semantically equivalent to their Gospel counterparts.

Lenses

To describe ownership of values, we use what are called Lenses; lightweight notation to encode representation predicates. For example, let us assume once again a queue whose logical model is a sequence of integers:

type t
(*@ mutable model : int sequence *)

val push : t -> int -> unit
(*@ f q x
    modifies q
    ensures q = q ++ [x]*)

As I stated in the previous section, the variable q has type int sequence in the specification. However, in the OCaml interface, it has type t. To make this transition, we require a representation predicate (in Gospel's case, a lens) that lifts the program value into its logical value. In this case, we could write

val push : t -> int -> unit
(*@ f q x
    modifies q @ t
    ensures ... *)

The modifies clause states that we receive ownership of the OCaml variable q using the lens t and lift it into the logical type stated in t's model, that is, a sequence of integers. When we use the lens with the same name as the original type (which we will refer to as the default lens), we say that we have full ownership of that value. Additionally, the modifies clause also states that we return the same ownership of q. Given these semantics, we could translate this specification into the following Separation Logic triple (in CFML syntax for simplicity's sake).

forall (prog_q : loc) (q : int list) (x : int)
PRE (prog_q ~> Queue q)
POSTUNIT (prog_q ~> Queue (q ++ [x]))

Important to note that lenses are not exclusive used in specifications as we see in the Section on Logical Quantifiers, however, we will mostly focus on this use case as it is the most common as well as the simplest.

Ownership clauses

To describe ownership with lenses, we can use one of the following clauses:

  • consumes x @ t: states that the function receives ownership of a value x as described by the representation predicate t.
  • produces x @ t: states that the function returns ownership of a value x as described by the representation predicate t.
  • modifies x @ t: states that the function receives and returns ownership of a value x as described by the representation predicate t. Note that Gospel treats this clause as syntactic sugar for a consumes and produces clause.
  • preserves x @ t: states that the function receives and returns ownership of a value x as described by the representation predicate t. Note that this is syntactic sugar for a produces and consumes clause coupled with an ensures clause stating that x is unchanged.

If no ownership clause is defined for a function argument, then Gospel assumes a preserves clause with full ownership. In the case of return values, we assume a produces with full ownership.

Available Lenses

For now, besides the default lens, we only have access to two lenses:

  • The val lens, that gives us no ownership of a value
  • The loc lens, that gives us no ownership of a value, which can only be used for mutable values. The reason why there are two different lenses that have similar purposes is to specify functions that feature physical equality. For example, due to how physical equality is defined in OCaml, it may be desirable to define it only for the case when it receives two memory locations. To do this, we can state:

    val ph_eq : 'a -> 'a -> bool
    (*@ ph_eq x y
        preserves x @ loc, y @ loc
        ensures x = y *)
    

    This way, this function can only be called with values with which we can apply the lens loc, in other words, mutable values.

Limitations

  • No support for polymorphic lenses, in other words, higher order representation predicates.
  • No support for group ownership.

Translations of Lenses

This section will showcase some simple examples of how lenses are translated into Separation Logic. To showcase, I will show an example of a stack module, its Gospel specification and its CFML translation.

type 'a t
(*@ mutable model : 'a Sequence.t *)

Translates into

Parameter Stack :
forall A {Ih : Inhab A} {EA : Enc A},
sequence A -> loc -> hprop

A few details about this translation:

  • The type definition is omitted in the final Coq code and only the representation predicate remains. This is because the type is mutable which is not allowed in our Separation Logic encoding.
  • When defining the representation predicate, we also state that the type argument A is inhabited, an assumption that Gospel makes. Additionally, we also state that the type can be encoded in OCaml. The first assumption is made for all type arguments whereas the later is only made for type arguments of OCaml functions.
  • The first argument of the predicate is the logical model, the second argument is the CFML type for memory locations. Its return is a separation logic proposition.
  • The name of the representation predicate (Stack) is automatically inferred from the module name.
val create : unit -> 'a t
(*@ q = create ()
    ensures q = empty *)
Parameter _create : CFML.Semantics.val.

Parameter _create_spec:
  forall (A : Type) (Ih : Inhab a) (EA : Enc a),
  SPEC_PURE (_create tt)
    POST (fun _prog_q : loc => _prog_q ~> Stack_cfml empty)

A few more details

  • Since the original specification does not have any ownership clauses that reference the return value, we assume produces q @ t.
  • Just like in the original program, we create an abstract function definition as well as a triple.
  • The create specification returns a variable _prog_q which is not present in the original specification. This is because the q return value represents the logical model of the stack, whereas _prog_q represents its memory location, which is not explicitly defined in the Gospel side.
  • Interestingly, the q variable is not present in the generated specification. This is because in this case the Gospel specification can be captured without it, although there are cases where it must be defined via existential quantification.
val is_empty : 'a t -> bool
(*@ b = is_empty q
    ensures b <-> q = empty *)

Translates into:

Parameter _is_empty : CFML.Semantics.val.

Parameter _is_empty_spec :
  forall (A : Type) (IH : Inhab a) (EA : Enc a)
    (_prog_q : loc) (q : sequence a),
  SPEC (_is_empty _prog_q)
  PRE _prog_q ~> Stack_cfml q
  POST (fun _prog_b : bool =>
        _prog_q ~> Stack_cfml q \* _prog_b ~> Bool (q = empty))

More details:

  • Since the Gospel specification does not have any ownership clauses regarding the variable q, we assume preserves @ t.
  • Since the q variable is unmodified and its ownership is preserved, we use the representation predicate in both the pre and post conditions to claim ownership.
  • The Bool representation predicate allows us to jump from the OCaml bool type to the Coq Prop type.
val push : 'a t -> 'a -> unit
(*@ push p x
    modifies p
    ensures p = cons x (old p) *)

Translates into:

Parameter _push : CFML.Semantics.val.

Parameter _push_spec :
  forall (a : Type) (aIh : Inhab a) (Ea : Enc a)
    (_prog_p : loc) (p : sequence a) (x : a),
  SPEC (_push _prog_p x)
  PRE _prog_p ~> Stack_cfml p
  POSTUNIT _prog_p ~> Stack_cfml (cons x p)
val pop : 'a t -> 'a
(*@ r = pop p
    modifies p
    raises Not_found
    checks p <> empty
    ensures (old p) = cons r p *)

Translates into:

Parameter _pop : CFML.Semantics.val.

Parameter _pop_spec :
  forall (a : Type) (aIh : Inhab a) (Ea : Enc a)
    (_prog_p : loc) (p : sequence a),
  p <> empty ->
  SPEC (_pop _prog_p)
  PRE _prog_p ~> Stack_cfml p
  POST (fun r : a =>
        \exists _p_ : sequence a,
          _prog_p ~> Stack_cfml _p_ \* \[p = cons r _p_])

Some more details:

  • In the postcondition for this spec, we must explicitly lift the pointer _prog_p using an existentially quantified variable.
  • In the original Gospel spec, we have a raises clause stating that the function raises the Not_found exception, as well as a raises clause stating that if the stack is not empty, no exception is raised. Since CFML does not support exceptions, we state that the generated CFML spec is valid if the stack is not empty.

Logical Quantifiers

If we wish to state a property universal to an OCaml type that cannot be captured by an invariant, we might find ourselves using an axiom. However, if we quantify over an OCaml type, we must apply some lens in order to access the model of the value. To give an example, let's assume we wish to state that all queues are non-empty:

(*@ axiom q_empty : forall q : 'a t, q <> Sequence.empty *)

When translating into Separation Logic, we must ensure that we correctly apply the representation predicate in order to use its model to express the desired behaviour. One possible translation is as follows:

Parameter q_empty :
forall A q_loc (q : sequence A),
  Queue q_loc ~> q -*
  \[q = Sequence.empty] * Queue q_loc ~> q

The general idea is similar to what we saw with specifications: we create a value of type loc that describes the memory location of the OCaml value as well as a value with the logical representation of this value. We also insert an application of a representation predicate that describes ownership of the value. One of the few changes is that now we have a separating implication stating that if we have ownership of the queue, then we can derive the desired property, all the while remaining with ownership of the value.

Prop and Bool

In current Gospel, there is an internal distinction made between booleans and propositions, although this distinction is not made apparent to users. As part of my work, I have changed the Gospel type checker so that this distinction is not made. This makes translations into CFML a bit trickier since Coq does make an explicit distinction between the two. For example:

function ex (x : bool) (y : bool) =
    true -> x && y

If we were to translate this term verbatim into Coq we would get a typing error, since the terms true and x && y evaluate to booleans and the Coq implications require values of type Prop. The simplest way to solve this is by normalizing the resulting Coq program so that all booleans and boolean related operations are translated into either Prop or bool. We chose the former since it makes proofs easier to write, seeing as if all the variables were booleans, we would have to frequently make explicit conversions to Prop. The downside is that some cases require somewhat verbose conversions from Prop to bool.

How the translation is done:

Most cases are quite straightforward: any function arguments of type bool are turned into Prop and all built in logical symbols are translated into Coq equivalents that operate on the Prop type. There are a few details however worth noting:

Equality

In Gospel the proposition p1 = p2 <-> p1 <-> p2 for all p1 and p2. However, in Coq, since we are targeting the Prop type, this is not valid. Conveniently, CFML uses TLC, which assumes the propositional extensionality, meaning that the statement above is valid without any extra configuration. Iris could still be problematic if propositional extensionality is not assumed.

Pattern Matching

Pattern matching over Prop is impossible in Coq, meaning that we must modify Gospel expressions that use boolean pattern matching so that it is valid in Coq. Assuming an expression match e with ..., when we translate into Coq, e will be of type Prop, meaning we must first transform it into a bool. For example:

match true || false with ...

Is translated to:

match Prop_to_bool (True \/ False) with ...

In the case where the pattern is a variable name, then we have to redefine a new variable with the same name but of type Prop. For example:

match true with
|x ->  x -> true

Is Translated To:

match Prop_to_bool True with
|_x -> let x := Prop_to_bool _x in x -> true
end

Program Variables

When we have an OCaml function that receives an argument of type bool, we cannot simply transform it into an argument of type Prop seeing as CFML would require an encoder for the Prop type which cannot exist. Therefore it must remain of type bool, and within the spec must be turned into a Prop. For example:

val flip : bool -> bool
(*@ r = flip b
    ensures r <-> b *)

Translates to:

forall (_prog_b : bool) (b : Prop),
SPEC(flip _prog_b)
PRE(_prog_b ~> Bool b)
POST(fun _prog_r : bool -> _prog_r ~> Bool b)

Bazinga

Turns out this is all bad and we should keep the internal distinction ๐Ÿคท

Gospel Type Semantics

The semantics of our Separation Logic translation are fairly straightforward, it has triples, axioms, functions and representation predicates. However, since the typing discipline is not as obvious, I will give a little bit of context.

All type declarations are pure in the intermediate representation. When we are faced with an OCaml type that has a mutable model field, this type disappears from the intermediate representation, being subsumed into the loc type. Additionally, within Gospel terms, the user is never allows to refer to any fields or constructors within the OCaml type and is only allowed to talk about its logical model, even if the type is pure (unless, of course, if the model is a reflection of the OCaml type as described in Section Pure Annotations.). If at some point we find it useful to talk about OCaml values in Gospel terms (e.g. representation predicates) these semantics might have to be adjusted. One possibility would be to add "field" declarations similar to what Viper does. This way, we remain only with pure types but with each of the fields declared.

Additionally, all types are assumed to be inhabited. At the moment, Gospel does not support any construct to create empty types, meaning it is impossible to generate any inconsistencies.

Higher order lenses

Currently, lenses are first order, meaning we cannot claim recursive ownership of polymorphic data types. This greatly limits their usefulness since we can basically only use them to choose if we want to claim ownership of a mutable data structure or not. If we allowed lenses to be higher order, we could capture in our specs, for example, containers that store mutable values. This way we greatly increase the amount of programs Gospel can specify as well as allowing for a higher level of granularity when describing ownership conditions.

Lemmas

Translating lemmas into The tricky part is that when we translate terms from Separation Logic into CFML, we transform them into values of type hprop. This means that Coq will not require the user to prove that their axioms are sound. To remedy this, we translate axioms as follows:

(*@ axiom A : P *)
Parameter A : \[] ==> \[P]

Since A is now a Prop, Coq will now require that the user prove its correctness. Although this will work if the only ownership clauses refer to variables defined within the lemma, since these only generate conditions where ownership is only claimed on the left hand side of a magic wand (see previous section), this will not work if the axiom refers to global variables.

Open Questions

Impure Types in Logical Terms

As stated previously, when we have a type with multiple model fields, we create in our intermediate representation a record type with each field. This record type will then be the model for this type. However, with the Gospel specification, this type does not exist, which leaves us with the following question: in the following case:

type t = {ocaml_x : int; ocaml_y : int}
(*@ mutable model x : int
    mutable model y : int *)

val f : t -> unit
(*@ f arg
    ensures arg.x = arg.y *)

What is the type of arg within the specification? The clearest answer would be that it is of type t, however, type t is an OCaml type. This is problematic since these may only be used in specifications when they are pure. Current Gospel allows users to refer to the model fields and the program fields within the spec. I changed this so that internally, arg is of type t, but the user is only allowed to access the model fields x and y and not the fields ocaml_x and ocaml_y. This way, there is a clear divide between the OCaml and the Gospel world. Additionally, users may still refer to the model of t in situations such as:

(*@ predicate test (arg : t) =
    arg.x = arg.y *)

This might be inadequate when we add representation predicates in Gospel which will require us to access their program fields. A few alternatives to this would be:

  • Disallow named model fields and require the user to create an explicit ghost record type (seems annoying).
  • Have some special name for the model of t, (seems annoying to program since finding a naming convention that is both clear and does not clash with any other names would be quite tricky. Also seems annoying on the user side that they have to write the name of a type that does not exist explicitly in the spec)

TODO: duplicable definition (only needed for CFML) Lemma duplicable : forall s m, Duplicable (s ~> pstack m).

s ~> pstack m ==> s ~> pstack m * s ~> pstack m

Recursive Ownership of Elements

Currently in Gospel, we have to way of describing recursive ownership of data structures with mutable elements.

The Problem

If we wished to specify, for example, a stack where its elements could be impure, we would like to have a Gospel specification such as:

type 'a t
(*@ model : 'a sequence *)

val peek : 'a t -> 'a
(*@ r = pop q
    requires q <> empty
    preserves q @ 'a t
    produces r @ 'a
    ensures r = hd q *)

If we assume that 'a can be a mutable value, then our translation into Separation Logic should look like:

Parameter Stack : {A} {B} `{Enc A} (l : loc)
              (HO: A -> B -> hprop) (model : sequence A).

Parameter peek_spec :
forall A `{Enc A} q_loc HO q,
PRE(Stack q_loc HO q)
CODE(peek s)
POST(r_loc => exists r,
       HO r_loc r * Stack q_loc HO q * \[hd q = r])

Although this makes intuitive sense, it is unsound, seeing as we are duplicating the permission for the value stored in r_loc.

Magic Wand (in Gospel)

There are a few possibilities to circumvent this issue. The most obvious is using the magic wand: in the Gospel spec we could have something along the lines of:

val get : 'a ref -> 'a
(*@ v = get r
    consumes r @ 'a t
    produces v @ 'a
    produces r @ val t until r *)

The final produces clause would mean, intuitively, that we return q with no ownership of its elements. The until operator(?) signifies that when ownership of r, as described in the first produces clause, has been surrendered we regain ownership of q as is described in the consumes clause. The post condition of the previous specification would now look something like:

forall A `{Enc A} q_loc HO q,
POST(r_loc => exists r,
       HO r_loc r * Stack q_loc val q * \[hd q = r] *
       (HO r_loc r *-> Stack q_loc HO q)
       )

Exceptional Specifications

With the addition of ownership clauses in Gospel, we also need a way of describing ownership in cases of exceptional termination. The simplest (and indeed the most common case) is when we return full ownership of the data structure.

val pop : 'a t -> 'a
(*@ r = pop p
    modifies p
    raises Not_found begin
      produces p
      ensures p = old p = empty
    end
    ensures r = hd (old p)
    ensures p = tl (old p) *)

Problems with ownership clauses

In the nested specification, we state that we return full ownership of p as well as stating that p is unchanged and is equal to the empty sequence. Since this is an exceptional condition, ensures and produces are the only clauses that make sense to use. A few things about this specification are not ideal. Specifically, having to explicitly write produces and that the data structure is not modified can be a bit heavy. There are a few possible way to remedy this, none of them are fully satisfying:

  • We could allow the use of the preserves clause within exceptional postconditions and assume it to be the default. This is not ideal since preserves is syntactic sugar for a produces and a consumes and as I have stated, it does not make sense to have a produces within exceptional postconditions. We could have it so a preserves clause has a different semantics when used within an exceptional specification. From a user perspective this is not that bad since it is an intuitive use of the preserves clause. However, having it mean something different based on context makes the semantics a bit messy. (The more I think about this, the more I prefer this solution. Although this means that the desugaring of preserves is context dependent, it makes intuitive sense and does not break things in a fundemental way).
  • We could have a default which states that if no ownership clause is given then we assume we return the same ownership that we receive, that is, the exceptional produces clause is the same as the non-exceptional consumes clause. Additionally, we would also assume an ensures clause stating that the variable is unchanged. This is also not ideal because it would make it so produces clauses of exceptional specifications have a different default than
  • We could have a default that if no ownership clause is given then we assume the non-exceptional produces clause.

Changes to the Gospel syntax

There are now three ways to write exceptional postconditions:

  • A list comma separated identifiers

    raises E1, E2, E3
    
  • A nested specification where the only valid clauses are produces and ensures. The values x1 x2 ... are the exception's arguments, these may be omitted.

    raises E (x1, x2 ..) begin
      produces ...
      ensures ...
    end
    
  • A single exceptional postcondition.

    raises E (x1, x2, ...) -> P
    

    Which is just sugar for

    raises E (x1, x2 ...) begin
      ensures P
    end
    

In the latter two cases , all exceptional return values have to be named (i.e.) no wildcards are allowed, similar to what Gospel requires with regards to normal return values (either they are all named or none are named).

TODO Functor Constraints

Just like we can add constraints to OCaml functor applications, we should also be allowed to add additional Gospel constraints. For example:

module Make
  (H : HashedType)
  (S : SENTINELS with type t = H.t)
  : SET with type element = H.t
  (*@ with predicate equiv = H.equiv and
           function  tomb = S.tomb   and
           function  void = S.void
           *)

Updated syntax

In Gospel there is no way to talk about ownership of values when an exception is raised. To this end, we propose an updated syntax to allow users to write multiple specifications for each possible raised exception.

Currently, Gospel specifications start with a header where we define the names of the arguments and return values, which is then followed up with pre and postconditions and other clauses. Our proposal is to have an alternate syntax where the header appears after clauses that describe the program state before the function is called (requires, consumes, etc) and before clauses that describe the state after the function returns. For example, the specification for a pop function of a queue would be written as:


val pop : 'a queue -> 'a
(*@ consumes q
    requires not (empty q)
    let x = pop q in
    produces q
    ensures x :: q = old q *)

A peculiarity of this representation is that the function's arguments are used before they are named in the function call. This is similar to what Gospel does currently, where free variables used in the function call are assumed to be universally quantified, but a bit more non-standard.

In the special case where there are no exceptional clauses, the syntax would be similar, but with no in.

val pop : 'a queue -> 'a
(*@ requires q <> empty
    let x = pop q in *)

In the case of a function that may raise some exception, we would use a match to describe its behavior in the normal case and the exceptional case. For example, if we wish to describe the pop function without using the precondition not (empty q) and instead with an exceptional postcondition.

val pop : 'a queue -> 'a
(*@ consumes q
    match find q p with
    |exception Not_found ->
       produces q
       ensures p = old p = Sequence.empty
    |r ->
       produces q
       ensures x :: q = old q *)

In this case, since we consume q and produce it in the exceptional and non-exceptional specifications, we can use a modifies clause which is sugar for a consumes and a produces in every postcondition.

val pop : 'a queue -> 'a
(*@ modifies q
    match find q p with
    |exception Not_found ->
       ensures p = old p = Sequence.empty
    |r ->
       ensures x :: q = old q *)

Naturally, integrating these changes would deprecate the current Gospel syntax for specifications.

Shared Resources

There is currently no way of capturing shared permissions in Gospel, all values are exclusively owned by the functions that use them. To solve this, we could take inspiration from the Leaf library and add a notion of resources guarding other resources.

In Leaf, the authors introduce the โค” operator where G โค” P means that the resource G guards the resource P. In other words, we can exchange exclusive ownership of resource G to obtain shared ownership of a resource G.

Mutex example (monomorphic)

Specification for a mutually exclusive lock that protects a single reference.

type mutex
(*@ model protectee : loc *)

val create : unit -> mutex
(*@ consumes s @ int ref
    create [s : int ref] *)

val lock : mutex -> unit
(*@ lock m
    produces m @ acquired (* not strictly nec*)
    produces m.protectee @ int ref *)

val unlock : mutex -> unit
(*@ consumes m.protectee @ int ref
    consumes m @ acquired
    unlock m *)

val try_lock : mutex -> bool
(*@ let b = lock m in
    produces b -> m.protectee @ int ref *)

Mutex example

type (*@ 's *) mutex

(*@ affine lens acquired (m : 's mutex) *)

val create : unit -> (*@ 's *) mutex
(*@ consumes 's
    create () *)

val lock : (*@ 's *) mutex -> unit
(*@ lock m
    produces 's *)

val unlock : (*@ 's *) mutex -> unit
(*@ consumes 's
    unlock m *)

val try_lock : (*@ 's *) mutex -> bool
(*@ try_lock m
    produces b -> 's *)

Explicitly guarded mutex

type 'a mutex

type 'a guard
(*@ ephemeral *)

val create : 'a -> 'a mutex
(*@ consumes s
    create s *)

val lock : 'a mutex -> 'a guard
(*@ g = lock m *)

val unlock : 'a guard -> unit
(*@ consumes g @ 'a guard
    unlock g *)

val peek : 'a guard -> 'a
(*@ s = peek g
    produces g @ 'a guard <-> s @ 'a *)

Explicitly guarded mutex v2

type 'a mutex

type 'a guard

val create : 'a -> 'a mutex
(*@ consumes s
    create s *)

val lock : 'a mutex -> 'a guard
(*@ g = lock m *)

val unlock : 'a guard -> unit
(*@ consumes l @ 'a guard
    unlock l *)

val peek : 'a guard -> 'a
(*@ s = peek g
    produces g @ 'a guard == s @ 'a *)

val try_lock : (*@ 's *) mutex -> bool
(*@ try_lock m
    produces b -> 's *)

Shared Lock

type (*@ 'a *) rw_lock
(*@ ephemeral *)

(*@ type 'a exc
    ephemeral
    model m : loc
    model state : 'a *)

(*@ type 'a sh
    guards : 'a
    model m : loc
    model state : 'a *)

val rwlock_new : unit -> (*@ 'a *) rw_lock
(*@ consumes s
    create [s : 'a] *)

val rwlock_free : (*@ 'a *) rw_lock -> unit
(*@ consumes m
    rwlock_free m *)

val lock_exc : (*@ 'a *) rw_lock -> unit
(*@ guarded  m
    let [e : exc] = lock m in
    produces e @ 'a exc
    ensures e.m = &m *)

val unlock_exc : (*@ 'a *) rw_lock -> unit
(*@ guarded m
    requires e.m = &m
    consumes e @ 'a exc
    unlock [e : exc] m *)

val lock_shared : (*@ 'a *) rw_lock -> unit
(*@ guarded m
    let [e : sh] = lock m in
    produces e @ 'a exc
    ensures e.m = &m *)

val unlock_shared : (*@ 'a *) rw_lock -> unit
(*@ guarded m
    requires e.m = &m
    consumes e @ 'a exc
    let () = unlock [e : exc] m *)

Syntax for the Gospel Standard Library

When creating sequences, sets and bags, users are forces to use the functions exposed in the standard library. For example, if one were to create a set with every even number, they would have to write

Set.init (fun x -> x mod 2 = 0)

This is not ideal, since it is very verbose as well as starkly different from what one would expect from a mathematical definition. In this section, I will describe possible changes to the Gospel syntax to simplify writing specifications using the objects defined in the standard library.

The empty element

  • Sequences : []
  • Sets : {}

Initialization

  • Sequences : [ x; y; z ] creates a sequence containing elements x, y, z. These can be identifiers or formulas, but they naturally need to have the same type. A consequence of this is that the syntax for sequences would be the same for lists.
  • Sets : { x | fmla } creates a set with every element x that satisfies fmla, which must be a valid proposition. An alternate way to create a set would be { x; y; z } which creates a set with the values x, y, z. These can be identifiers or formulas, but they naturally need to have the same type.

Set operations

Miscellaneous changes

Section for minor changes I made in Gospel that aren't relevant for other sections.

old in Gospel function specifications

In Gospel we can use the old constructor in the specification of logical functions specifications. In my opinion, this should not be allowed since there is no practical reason to talk about the "old" version of a variable that is never modified.

Gospel documentation

Fib numbers as first spec and machine integers add requires for fibonacci to prevent overflow

Rewrite the first spec for clarity

fibonacci with big int

Internal Changes to Gospel

DONE Normalise AST constructors for function applications

  • State "DONE" from "TODO" [2025-02-14 Fri 13:13]

Currently Gospel has four constructors for function applications:

  • Tidapp for application of an identifier to a list of arguments.
  • Tapply for arbitrary function applications
  • Tinfix for infix operations that can be chained together (e.g.

3 < 4 < 5)

  • Tbinop for binary operations.

From my review of the code, I believe in the untyped AST we only need two constructors, one for arbitrary function applications and another for infix operations. The later isn't strictly necessary, however, during parsing the only way of desugaring infix operators into a Tapply would be with either completely changing the parsing of terms in a way that is not obvious to me or during parsing modifying the AST so that chained infix operators get turned into a chain of /\ which would be a bit painful. Right now, this transformation is done during typechecking which also feels wrong. I believe that the best way of going about doing this is to replace the Tinfix node with something that better describes a chain of infix operators.

I also found a funny bug. The following program should be accepted, but isn't:

#+beginexample ocaml (*@ function (<) (n : 'a) (m : 'a) : bool *)

(*@ function works : bool = (1 < 2) < true *)

(*@ function wrong : bool = true < (1 < 2) *) #+endsrc

TODO Gospel Type checker with Inferno

After doing some experiments with Inferno, I believe that is it a better alternative to current Gospel typechecking, given the following pros:

  • Since Inferno is very high level and declarative, it becomes very easy to read, maintain and extend the type checker, as opposed to current Gospel which has to deal with all of the low level details of type checking and inference.
  • It is less error prone since we are offloading a lot of the weight into Inferno, meaning we only have to ensure that all of our constraints are correct.

Things left to do

  • [-] Type declarations
    • [ ] Mutually Recursive type declarations
    • [X] Abstract type definitions
    • [X] ADTs
    • [ ] Record types
    • [X] Type aliases
  • [X] Logical Function declarations
    • [X] Recursive definitions
  • [ ] Modules
    • [ ] Module Alias
    • [ ] Functors
    • [ ] Module Constraints
      • [ ] With type
      • [ ] With module
  • [ ] Module types
    • [ ] List of Signatures
    • [ ] Functors
    • [ ] Module type Constraint
      • [ ] With type
      • [ ] Type substitution
    • [ ] Module alias
  • [ ] Value descriptions
  • [ ] Module opens
  • [ ] Exceptions
  • [-] Terms
    • [ ] Told
    • [X] Pattern matching
    • [ ] Records
    • [ ] Constructors

Issues

  • During name resolution, we keep all the definitions of all currently processed files in memory. We should save these to a file and read them when necessary.
  • I would prefer if types like integer, prop, etc, were in the Gospel standard library although I am not sure how to do this.
  • We need a way to decode Inferno type variables into names that are legible but do not conflict with other names in scope.

Design of the typechecker.

After parsing a Gospel annotated OCaml file and creating an untyped Gospel AST we go through each signature and perform a series of intermediate checks, these include:

  • Name resolution: In Gospel (like in OCaml), the name we use to reference a variable depends on where we reference it. For example, if we define a function [f] in a module [M], within that module, that function's name is [f], but outside of that module it is [M.f]. If we [open] the module, both names are valid. Since Inferno does not support any of these mechanisms, we must take care to tag every variable with a unique identifier. In our previous example, regardless of what name we use to reference [f], each instance of that variable will be tagged with the same identifier. This means when we build our constraints, it won't matter that the name of the variable is context dependent, since Inferno will only look at the unique identifier. A consequence of this is that that there should be no unbound variables when we build the Inferno constraint, since to tag every variable we must keep track of what variables have been defined.
  • Infix operators: During parsing, we use the [Tinfix] constructor to mark a chain of infix operators. After parsing however, this constructor is redundant since during typechecking, it is equivalent to two nested [Tapply] constructors. Therefore, to simplify building the Inferno constraint, we remove this constructor from all terms and replace it with [Tapply]. Although this step could be done in [Checker], it makes more sense to do it during this phase.
  • Duplicate names: There are some contexts in Gospel where we are not allowed to introduce the same variable twice into scope (e.g. duplicate function arguments). Since Inferno has no way to track this natively, we must handle this manually.

After these checks are complete, we create a copy of the signature where every identifier has been replaced with a uniquely tagged variable. If the signature has any Gospel terms, it is fed into the Inferno solver.

Design of the solver

Note: this section assumes some familiarity with Inferno

Most of the solver is pretty standard: we traverse the Gospel signature, take care to put every variable in scope using def constraints (meaning all local variables are created using a monomorphic scheme) and using the hastype function to infer the types of each sub term. There are, nevertheless, a few features that are not directly supported by Inferno that we had to implement ourselves:

  • Named type variables: When a user gives a name to a Gospel type variable it is considered rigid, meaning it cannot be unified with any other type. This means that Gospel functions such as:

    function x : 'a = 0

    Are ill typed since the rigid type variable 'a cannot be unified with integer. Although Inferno does have enough infrastructure that we could easily introduce a forall combinator that would create a rigid type variable, we went with a different approach. When we encounter a type annotation, we transform it into a deep type (an encoding provided by Inferno to turn types into Inferno variables). When we encounter a named type variable, instead of turning it into a deep variable that can be unified with other types, we create a Tvar structure which can only be unified with other Tvar with the same identifier.

    The reason why we don't use the rigid variables provided by Inferno is because of error messages. When we turn a Gospel type variable into an Inferno variable, there is no way to associate it with its name. This means when the Inferno variable is decoded back into a Gospel type variable, Inferno does no provide us with a way to give it its original name. This is problematic because it would mean error messages could not display the correct Gospel type. For example:

    function f (s1 : 'a sequence) (s2 : 'b sequence) = s1 = s2

    Should produce the message "mismatch between the type 'a sequence and 'b sequence". However, there is no way for Inferno to get back to the original names 'a and 'b.

    Note: This works because we only use def constraints to introduce new variable names which means that each variable has a monomorphic type scheme, if we switch to a polymorphic type scheme, we probably need to use Inferno's rigid type variables.

  • Top level definitions: if we feed the following signature into Inferno axiom eq : x = y it will raise an Unbound exception, since the names x and y are not defined within the scope of the axiom. However, it is possibly that the names x and y were defined previously in the top level. Since the Inferno solver cannot be used in such a way that it "remembers" previous constraints it solve, we mark variables as local during name resolution if they are defined in the term. When we are building a constraint, if we encounter a variable x that is not local, we go through the environment we create during name resolution and find the type of x.

    When we turn the type of x into a deep type, its type variables are turned into flexible Inferno variables using the exist binder, meaning they can be unified with other types. This ensures that top level definitions are let polymorphic.

    Note: although it is technically possible to map a list of top level signatures to a sequence of nested def constraints, this is not worth it since it leads to a very messy implementation where the complexity added is much greater than simply keeping track of the types of top level signatures.

DONE Inferno Efficiency

  • State "DONE" from "๐Ÿ‘๏ธ๐Ÿ‘„๐Ÿ‘๏ธ" [2025-02-14 Fri 13:13]

One possible downside is that I'm not sure with regards with efficiency how Inferno behaves. More specifically, when does Inferno populate the data structures it needs for type unification? Is it when the constraint is built or when solve is called? If the later, there are some real questions with regards to efficiency

๐Ÿ‘๏ธ๐Ÿ‘„๐Ÿ‘๏ธ Gospel specifications in OCaml AST

In the design document Nicolas mentioned that Gospel specifications should be inserted directly in the OCaml AST instead of in a copy of it. I don't know how that would work in practice.

DONE Recursive flag in OCaml AST for types

  • State "DONE" from "๐Ÿ‘๏ธ๐Ÿ‘„๐Ÿ‘๏ธ" [2025-03-05 Wed 14:15]

In the OCaml AST, types are always marked as recursive, I don't understand why.

Answer: turns out OCaml has a nonrec annotation that allows you to specify that a type is non recursive.

Emacs 29.3 (Org mode 9.6.15)