Research Notes
Table of Contents
- Type declarations
- Value descriptions
Prop
andBool
- Gospel Type Semantics
- Higher order lenses
- Lemmas
- Open Questions
- Exceptional Specifications
- TODO Functor Constraints
- Updated syntax
- Shared Resources
- Syntax for the Gospel Standard Library
- Miscellaneous changes
- Gospel documentation
- Internal Changes to Gospel
Type declarations
When translating a type declaration into separation logic, we have to
create, besides the type definition, a representation predicate that
receives a program variable and lifts it into a logical
representation. This logical representation is defined using the model
field in the type specification.
Unnamed Model Fields
One of the changes we made to the Gospel typechecker was allowing
model
fields to be unnamed. For example, if we wish to specify a queue
in current gospel whose logical model is a sequence
, we must write as
such.
type t (*@ model view : int sequence *) val push : t -> int -> unit (*@ push q x ensures q.view = old q.view ++ [x] *)
Although the model of the queue is the only thing that interests us in
this specification (and indeed, most specifications) we still have to
access it via the field view
. This can not only be cumbersome but also
make specifications more verbose than they need be. Although the
example above is still valid in Gospel, we have developed an alternate
syntax for cases where types only have a single model
field.
type t (*@ model : int sequence *) val push : t -> int -> unit (*@ push q x ensures q = old q ++ [x] *)
In the second code snippet, q
in the specification has the same type
as its model field. This way, the specification becomes a bit easier
to write and read.
Multiple Model Fields
When creating the representation predicate for a type with named model
fields, we have the option between creating a predicate with multiple
fields for the logical values or creating a record type with each
field described in the specification. I believe that the sanest way to
do this is by using ghost record type. This is because multiple fields
can make the translation more illegible, as well as more difficult to
execute, seeing as I would have to replace all occurrences of model
field accesses as well as manage potential problems with clashing
names.
Example (assuming x
is of type t
with two model fields y
and z
) :
(*@ f x y z ensures x.y + y = z *)
would become, during translation
{}(f(x, y, z)){P(x, y, z) * y + y = z}
While typing, I assume just one type and, in the specification, handle
the model fields as ghost
fields (which is already the default
behaviour). While translating to separation logic, I create two types,
one for the program type and another for the model.
Naming problems
The tricky question to solve however is what names to give these types? I think the easiest solution is to give the logical type the same name as the OCaml type and give the program type the same name, but with an underscore at the beginning. This way, we don't have to change anything in the typed specification.
Ephemeral and Pure Annotations
It is also possible to not give the type a model
field and instead opt
to give an ephemeral
annotation as such
type t (*@ ephemeral *)
This way, we can mark the type as mutable without giving a precise definition of its logical representation. If the type has no annotations, then we assume it to be pure and its model is a reflection of the OCaml type.
Representation Predicates
As stated previously, when we translate a type declaration, we also need to generate an appropriate representation predicate. How we do so depends on how the type is defined.
Abstract Types
Since we cannot give a representation predicate to a type that is
undefined in the program, we simply create its signature. This
predicate will have two arguments, the program value and the logical
representation, that is, the type defined in its model
clause.
- If the type is pure, then the program value is the same as the OCaml
type. In the case where the
model
is a reflection of the OCaml type as described in Section Ephemeral and Pure Annotations, no representation predicate is generated - If the type is mutable, then the program value is of type
loc
.
Concrete Types
TBD
Note on naming
Generally, the generated representation predicate has the same name
as the original type, but with the first letter capitalized and an
underscore at the start. However, when the type's name is t
, the
representation predicate gets the same name as the name of the module,
so as to make the generated Separation Logic triples simpler to read.
Translation of Type declarations
type t (*@ model : int *)
Translates into
Type t Predicate _T (target : t) (model : int)
type t (*@ mutable model : int *)
Translates into
Predicate _T (target : loc) (model : int)
type t (*@ model x : int model y : int *)
Translates into
Type t = {x : int; y : int} Type _t Predicate _T (target : t) (model : _t)
type t (*@ mutable model x : int model y : int *)
Translates into
Type t = {x : int; y : int} Predicate _T (target : loc) (model : t)
Implementation
- Check if the type has a specification. If so, retrieve the model, if not, assume the model is a reflection of the OCaml type.
- We must check if the type has multiple model fields. If so, we create an optional value that has the definition of a record type with all these fields. This type is always marked is non-mutable. The name of this type is the same as the name of the original type.
- We create a definition of the type. The name of this type depends if we created a record in the previous step. If so, the name is the old name with an underscore at the start. Otherwise, the name is unchanged. For now, we only support abstract types.
- We create the definition for the representation predicate. If the
type is mutable, the first argument is of type
loc
, otherwise, the first argument is the program type. The next argument is the type's model, as was defined previously.
Limitations
- When translated into CFML, record fields with names that have already been defined don't work (Low priority).
- No support for polymorphic types with recursive ownership.
- No support for concrete type definitions. This can be quite tricky
since in these cases we would need to define an explicit
representation predicate in Gospel, which is not yet possible.
- Additionally, we must consider the relationship between invariants and representation predicates.
Value descriptions
When translating a value description and its specification, we must take proper care to ensure that all the ownership conditions are properly generated and are semantically equivalent to their Gospel counterparts.
Lenses
To describe ownership of values, we use what are called Lenses;
lightweight notation to encode representation predicates. For example,
let us assume once again a queue whose logical model
is a sequence of
integers:
type t (*@ mutable model : int sequence *) val push : t -> int -> unit (*@ f q x modifies q ensures q = q ++ [x]*)
As I stated in the previous section, the variable q
has type int
sequence
in the specification. However, in the OCaml interface, it has
type t
. To make this transition, we require a representation predicate
(in Gospel's case, a lens) that lifts the program value into
its logical value. In this case, we could write
val push : t -> int -> unit (*@ f q x modifies q @ t ensures ... *)
The modifies
clause states that we receive ownership of the OCaml
variable q
using the lens t
and lift it into the logical type
stated in t
's model, that is, a sequence of integers. When we use the
lens with the same name as the original type (which we will
refer to as the default lens), we say that we have full
ownership of that value. Additionally, the modifies
clause also states
that we return the same ownership of q
. Given these semantics, we
could translate this specification into the following Separation Logic
triple (in CFML syntax for simplicity's sake).
forall (prog_q : loc) (q : int list) (x : int) PRE (prog_q ~> Queue q) POSTUNIT (prog_q ~> Queue (q ++ [x]))
Important to note that lenses are not exclusive used in specifications as we see in the Section on Logical Quantifiers, however, we will mostly focus on this use case as it is the most common as well as the simplest.
Ownership clauses
To describe ownership with lenses, we can use one of the following clauses:
consumes x @ t
: states that the function receives ownership of a valuex
as described by the representation predicatet
.produces x @ t
: states that the function returns ownership of a valuex
as described by the representation predicatet
.modifies x @ t
: states that the function receives and returns ownership of a valuex
as described by the representation predicatet
. Note that Gospel treats this clause as syntactic sugar for aconsumes
andproduces
clause.preserves x @ t
: states that the function receives and returns ownership of a valuex
as described by the representation predicatet
. Note that this is syntactic sugar for aproduces
andconsumes
clause coupled with anensures
clause stating thatx
is unchanged.
If no ownership clause is defined for a function argument, then Gospel
assumes a preserves
clause with full ownership. In the case of return
values, we assume a produces
with full ownership.
Available Lenses
For now, besides the default lens, we only have access to two lenses:
- The
val
lens, that gives us no ownership of a value The
loc
lens, that gives us no ownership of a value, which can only be used for mutable values. The reason why there are two different lenses that have similar purposes is to specify functions that feature physical equality. For example, due to how physical equality is defined in OCaml, it may be desirable to define it only for the case when it receives two memory locations. To do this, we can state:val ph_eq : 'a -> 'a -> bool (*@ ph_eq x y preserves x @ loc, y @ loc ensures x = y *)
This way, this function can only be called with values with which we can apply the lens
loc
, in other words, mutable values.
Limitations
- No support for polymorphic lenses, in other words, higher order representation predicates.
- No support for group ownership.
Translations of Lenses
This section will showcase some simple examples of how lenses are translated into Separation Logic. To showcase, I will show an example of a stack module, its Gospel specification and its CFML translation.
type 'a t (*@ mutable model : 'a Sequence.t *)
Translates into
Parameter Stack : forall A {Ih : Inhab A} {EA : Enc A}, sequence A -> loc -> hprop
A few details about this translation:
- The type definition is omitted in the final Coq code and only the representation predicate remains. This is because the type is mutable which is not allowed in our Separation Logic encoding.
- When defining the representation predicate, we also state that the
type argument
A
is inhabited, an assumption that Gospel makes. Additionally, we also state that the type can be encoded in OCaml. The first assumption is made for all type arguments whereas the later is only made for type arguments of OCaml functions. - The first argument of the predicate is the logical model, the second argument is the CFML type for memory locations. Its return is a separation logic proposition.
- The name of the representation predicate (
Stack
) is automatically inferred from the module name.
val create : unit -> 'a t (*@ q = create () ensures q = empty *)
Parameter _create : CFML.Semantics.val. Parameter _create_spec: forall (A : Type) (Ih : Inhab a) (EA : Enc a), SPEC_PURE (_create tt) POST (fun _prog_q : loc => _prog_q ~> Stack_cfml empty)
A few more details
- Since the original specification does not have any ownership clauses
that reference the return value, we assume
produces q @ t
. - Just like in the original program, we create an abstract function definition as well as a triple.
- The
create
specification returns a variable_prog_q
which is not present in the original specification. This is because theq
return value represents the logical model of the stack, whereas_prog_q
represents its memory location, which is not explicitly defined in the Gospel side. - Interestingly, the
q
variable is not present in the generated specification. This is because in this case the Gospel specification can be captured without it, although there are cases where it must be defined via existential quantification.
val is_empty : 'a t -> bool (*@ b = is_empty q ensures b <-> q = empty *)
Translates into:
Parameter _is_empty : CFML.Semantics.val. Parameter _is_empty_spec : forall (A : Type) (IH : Inhab a) (EA : Enc a) (_prog_q : loc) (q : sequence a), SPEC (_is_empty _prog_q) PRE _prog_q ~> Stack_cfml q POST (fun _prog_b : bool => _prog_q ~> Stack_cfml q \* _prog_b ~> Bool (q = empty))
More details:
- Since the Gospel specification does not have any ownership clauses
regarding the variable
q
, we assumepreserves @ t
. - Since the
q
variable is unmodified and its ownership is preserved, we use the representation predicate in both the pre and post conditions to claim ownership. - The
Bool
representation predicate allows us to jump from the OCamlbool
type to the CoqProp
type.
val push : 'a t -> 'a -> unit (*@ push p x modifies p ensures p = cons x (old p) *)
Translates into:
Parameter _push : CFML.Semantics.val. Parameter _push_spec : forall (a : Type) (aIh : Inhab a) (Ea : Enc a) (_prog_p : loc) (p : sequence a) (x : a), SPEC (_push _prog_p x) PRE _prog_p ~> Stack_cfml p POSTUNIT _prog_p ~> Stack_cfml (cons x p)
val pop : 'a t -> 'a (*@ r = pop p modifies p raises Not_found checks p <> empty ensures (old p) = cons r p *)
Translates into:
Parameter _pop : CFML.Semantics.val. Parameter _pop_spec : forall (a : Type) (aIh : Inhab a) (Ea : Enc a) (_prog_p : loc) (p : sequence a), p <> empty -> SPEC (_pop _prog_p) PRE _prog_p ~> Stack_cfml p POST (fun r : a => \exists _p_ : sequence a, _prog_p ~> Stack_cfml _p_ \* \[p = cons r _p_])
Some more details:
- In the postcondition for this spec, we must explicitly lift the
pointer
_prog_p
using an existentially quantified variable. - In the original Gospel spec, we have a
raises
clause stating that the function raises theNot_found
exception, as well as araises
clause stating that if the stack is not empty, no exception is raised. Since CFML does not support exceptions, we state that the generated CFML spec is valid if the stack is not empty.
Logical Quantifiers
If we wish to state a property universal to an OCaml type that cannot be captured by an invariant, we might find ourselves using an axiom. However, if we quantify over an OCaml type, we must apply some lens in order to access the model of the value. To give an example, let's assume we wish to state that all queues are non-empty:
(*@ axiom q_empty : forall q : 'a t, q <> Sequence.empty *)
When translating into Separation Logic, we must ensure that we correctly apply the representation predicate in order to use its model to express the desired behaviour. One possible translation is as follows:
Parameter q_empty : forall A q_loc (q : sequence A), Queue q_loc ~> q -* \[q = Sequence.empty] * Queue q_loc ~> q
The general idea is similar to what we saw with specifications: we
create a value of type loc
that describes the memory location of the
OCaml value as well as a value with the logical representation of this
value. We also insert an application of a representation predicate
that describes ownership of the value. One of the few changes is that
now we have a separating implication stating that if we have ownership
of the queue, then we can derive the desired property, all the while
remaining with ownership of the value.
Prop
and Bool
In current Gospel, there is an internal distinction made between booleans and propositions, although this distinction is not made apparent to users. As part of my work, I have changed the Gospel type checker so that this distinction is not made. This makes translations into CFML a bit trickier since Coq does make an explicit distinction between the two. For example:
function ex (x : bool) (y : bool) = true -> x && y
If we were to translate this term verbatim into Coq we would get a
typing error, since the terms true
and x && y
evaluate to booleans
and the Coq implications require values of type Prop
. The simplest way
to solve this is by normalizing the resulting Coq program so that all
booleans and boolean related operations are translated into either
Prop
or bool
. We chose the former since it makes proofs easier to
write, seeing as if all the variables were booleans, we would have to
frequently make explicit conversions to Prop
. The downside is that
some cases require somewhat verbose conversions from Prop
to bool
.
How the translation is done:
Most cases are quite straightforward: any function arguments of type
bool
are turned into Prop
and all built in logical symbols are translated
into Coq
equivalents that operate on the Prop
type. There are a few
details however worth noting:
Equality
In Gospel the proposition p1 = p2 <-> p1 <-> p2
for all p1
and
p2
. However, in Coq, since we are targeting the Prop
type, this is not
valid. Conveniently, CFML uses TLC, which assumes the propositional
extensionality, meaning that the statement above is valid without any
extra configuration. Iris could still be problematic if propositional
extensionality is not assumed.
Pattern Matching
Pattern matching over Prop
is impossible in Coq, meaning that we must
modify Gospel expressions that use boolean pattern matching so that it
is valid in Coq. Assuming an expression match e with ...
, when we
translate into Coq, e
will be of type Prop
, meaning we must first
transform it into a bool
. For example:
match true || false with ...
Is translated to:
match Prop_to_bool (True \/ False) with ...
In the case where the pattern is a variable name, then we have to
redefine a new variable with the same name but of type Prop
. For
example:
match true with |x -> x -> true
Is Translated To:
match Prop_to_bool True with |_x -> let x := Prop_to_bool _x in x -> true end
Program Variables
When we have an OCaml function that receives an argument of type bool
,
we cannot simply transform it into an argument of type Prop
seeing as
CFML would require an encoder for the Prop
type which cannot
exist. Therefore it must remain of type bool
, and within the spec must
be turned into a Prop. For example:
val flip : bool -> bool (*@ r = flip b ensures r <-> b *)
Translates to:
forall (_prog_b : bool) (b : Prop), SPEC(flip _prog_b) PRE(_prog_b ~> Bool b) POST(fun _prog_r : bool -> _prog_r ~> Bool b)
Bazinga
Turns out this is all bad and we should keep the internal distinction ๐คท
Gospel Type Semantics
The semantics of our Separation Logic translation are fairly straightforward, it has triples, axioms, functions and representation predicates. However, since the typing discipline is not as obvious, I will give a little bit of context.
All type declarations are pure in the intermediate
representation. When we are faced with an OCaml type that has a
mutable model field, this type disappears from the intermediate
representation, being subsumed into the loc
type. Additionally, within
Gospel terms, the user is never allows to refer to any fields or
constructors within the OCaml type and is only allowed to talk about
its logical model
, even if the type is pure (unless, of course, if the
model is a reflection of the OCaml type as described in Section Pure
Annotations.). If at some point we find it useful to talk about
OCaml values in Gospel terms (e.g. representation predicates) these
semantics might have to be adjusted. One possibility would be to add
"field" declarations similar to what Viper does. This way, we remain
only with pure types but with each of the fields declared.
Additionally, all types are assumed to be inhabited. At the moment, Gospel does not support any construct to create empty types, meaning it is impossible to generate any inconsistencies.
Higher order lenses
Currently, lenses are first order, meaning we cannot claim recursive ownership of polymorphic data types. This greatly limits their usefulness since we can basically only use them to choose if we want to claim ownership of a mutable data structure or not. If we allowed lenses to be higher order, we could capture in our specs, for example, containers that store mutable values. This way we greatly increase the amount of programs Gospel can specify as well as allowing for a higher level of granularity when describing ownership conditions.
Lemmas
Translating lemmas into
The tricky part is that when we translate terms from Separation Logic
into CFML, we transform them into values of type hprop
. This means
that Coq will not require the user to prove that their axioms are
sound. To remedy this, we translate axioms as follows:
(*@ axiom A : P *)
Parameter A : \[] ==> \[P]
Since A
is now a Prop
, Coq will now require that the user prove its
correctness. Although this will work if the only ownership clauses
refer to variables defined within the lemma, since these only generate
conditions where ownership is only claimed on the left hand side of a
magic wand (see previous section), this will not work if the axiom
refers to global variables.
Open Questions
Impure Types in Logical Terms
As stated previously, when we have a type with multiple model
fields, we create in our intermediate representation a record type
with each field. This record type will then be the model for this
type. However, with the Gospel specification, this type does not
exist, which leaves us with the following question: in the following case:
type t = {ocaml_x : int; ocaml_y : int} (*@ mutable model x : int mutable model y : int *) val f : t -> unit (*@ f arg ensures arg.x = arg.y *)
What is the type of arg
within the specification? The clearest answer
would be that it is of type t
, however, type t
is an OCaml type. This
is problematic since these may only be used in specifications when
they are pure. Current Gospel allows users to refer to the model
fields and the program fields within the spec. I changed this so that
internally, arg
is of type t
, but the user is only allowed to access
the model fields x
and y
and not the fields ocaml_x
and ocaml_y
. This
way, there is a clear divide between the OCaml and the Gospel
world. Additionally, users may still refer to the model of t
in
situations such as:
(*@ predicate test (arg : t) = arg.x = arg.y *)
This might be inadequate when we add representation predicates in Gospel which will require us to access their program fields. A few alternatives to this would be:
- Disallow named model fields and require the user to create an explicit ghost record type (seems annoying).
- Have some special name for the model of
t
, (seems annoying to program since finding a naming convention that is both clear and does not clash with any other names would be quite tricky. Also seems annoying on the user side that they have to write the name of a type that does not exist explicitly in the spec)
TODO: duplicable definition (only needed for CFML) Lemma duplicable : forall s m, Duplicable (s ~> pstack m).
s ~> pstack m ==> s ~> pstack m * s ~> pstack m
Recursive Ownership of Elements
Currently in Gospel, we have to way of describing recursive ownership of data structures with mutable elements.
The Problem
If we wished to specify, for example, a stack where its elements could be impure, we would like to have a Gospel specification such as:
type 'a t (*@ model : 'a sequence *) val peek : 'a t -> 'a (*@ r = pop q requires q <> empty preserves q @ 'a t produces r @ 'a ensures r = hd q *)
If we assume that 'a
can be a mutable value, then our translation into
Separation Logic should look like:
Parameter Stack : {A} {B} `{Enc A} (l : loc) (HO: A -> B -> hprop) (model : sequence A). Parameter peek_spec : forall A `{Enc A} q_loc HO q, PRE(Stack q_loc HO q) CODE(peek s) POST(r_loc => exists r, HO r_loc r * Stack q_loc HO q * \[hd q = r])
Although this makes intuitive sense, it is unsound, seeing as we are
duplicating the permission for the value stored in r_loc
.
Magic Wand (in Gospel)
There are a few possibilities to circumvent this issue. The most obvious is using the magic wand: in the Gospel spec we could have something along the lines of:
val get : 'a ref -> 'a (*@ v = get r consumes r @ 'a t produces v @ 'a produces r @ val t until r *)
The final produces
clause would mean, intuitively, that we return q
with no ownership of its elements. The until
operator(?) signifies
that when ownership of r
, as described in the first produces
clause,
has been surrendered we regain ownership of q
as is described in the
consumes
clause. The post condition of the previous specification
would now look something like:
forall A `{Enc A} q_loc HO q, POST(r_loc => exists r, HO r_loc r * Stack q_loc val q * \[hd q = r] * (HO r_loc r *-> Stack q_loc HO q) )
Exceptional Specifications
With the addition of ownership clauses in Gospel, we also need a way of describing ownership in cases of exceptional termination. The simplest (and indeed the most common case) is when we return full ownership of the data structure.
val pop : 'a t -> 'a (*@ r = pop p modifies p raises Not_found begin produces p ensures p = old p = empty end ensures r = hd (old p) ensures p = tl (old p) *)
Problems with ownership clauses
In the nested specification, we state that we return full ownership of
p
as well as stating that p
is unchanged and is equal to the empty
sequence. Since this is an exceptional condition, ensures
and produces
are the only clauses that make sense to use. A few things about this
specification are not ideal. Specifically, having to explicitly write
produces
and that the data structure is not modified can be a bit
heavy. There are a few possible way to remedy this, none of them are
fully satisfying:
- We could allow the use of the
preserves
clause within exceptional postconditions and assume it to be the default. This is not ideal sincepreserves
is syntactic sugar for aproduces
and aconsumes
and as I have stated, it does not make sense to have aproduces
within exceptional postconditions. We could have it so apreserves
clause has a different semantics when used within an exceptional specification. From a user perspective this is not that bad since it is an intuitive use of thepreserves
clause. However, having it mean something different based on context makes the semantics a bit messy. (The more I think about this, the more I prefer this solution. Although this means that the desugaring ofpreserves
is context dependent, it makes intuitive sense and does not break things in a fundemental way). - We could have a default which states that if no ownership clause is
given then we assume we return the same ownership that we receive,
that is, the exceptional
produces
clause is the same as the non-exceptionalconsumes
clause. Additionally, we would also assume an ensures clause stating that the variable is unchanged. This is also not ideal because it would make it soproduces
clauses of exceptional specifications have a different default than - We could have a default that if no ownership clause is given then we
assume the non-exceptional
produces
clause.
Changes to the Gospel syntax
There are now three ways to write exceptional postconditions:
A list comma separated identifiers
raises E1, E2, E3
A nested specification where the only valid clauses are
produces
andensures
. The valuesx1 x2 ...
are the exception's arguments, these may be omitted.raises E (x1, x2 ..) begin produces ... ensures ... end
A single exceptional postcondition.
raises E (x1, x2, ...) -> P
Which is just sugar for
raises E (x1, x2 ...) begin ensures P end
In the latter two cases , all exceptional return values have to be named (i.e.) no wildcards are allowed, similar to what Gospel requires with regards to normal return values (either they are all named or none are named).
TODO Functor Constraints
Just like we can add constraints to OCaml functor applications, we should also be allowed to add additional Gospel constraints. For example:
module Make (H : HashedType) (S : SENTINELS with type t = H.t) : SET with type element = H.t (*@ with predicate equiv = H.equiv and function tomb = S.tomb and function void = S.void *)
Updated syntax
In Gospel there is no way to talk about ownership of values when an exception is raised. To this end, we propose an updated syntax to allow users to write multiple specifications for each possible raised exception.
Currently, Gospel specifications start with a header where we define the names of the arguments and return values, which is then followed up with pre and postconditions and other clauses. Our proposal is to have an alternate syntax where the header appears after clauses that describe the program state before the function is called (requires, consumes, etc) and before clauses that describe the state after the function returns. For example, the specification for a pop function of a queue would be written as:
val pop : 'a queue -> 'a (*@ consumes q requires not (empty q) let x = pop q in produces q ensures x :: q = old q *)
A peculiarity of this representation is that the function's arguments are used before they are named in the function call. This is similar to what Gospel does currently, where free variables used in the function call are assumed to be universally quantified, but a bit more non-standard.
In the special case where there are no exceptional clauses, the syntax
would be similar, but with no in
.
val pop : 'a queue -> 'a (*@ requires q <> empty let x = pop q in *)
In the case of a function that may raise some exception, we would use a match to describe its behavior in the normal case and the exceptional case. For example, if we wish to describe the pop function without using the precondition not (empty q) and instead with an exceptional postcondition.
val pop : 'a queue -> 'a (*@ consumes q match find q p with |exception Not_found -> produces q ensures p = old p = Sequence.empty |r -> produces q ensures x :: q = old q *)
In this case, since we consume q and produce it in the exceptional and non-exceptional specifications, we can use a modifies clause which is sugar for a consumes and a produces in every postcondition.
val pop : 'a queue -> 'a (*@ modifies q match find q p with |exception Not_found -> ensures p = old p = Sequence.empty |r -> ensures x :: q = old q *)
Naturally, integrating these changes would deprecate the current Gospel syntax for specifications.
Shared Resources
There is currently no way of capturing shared permissions in Gospel, all values are exclusively owned by the functions that use them. To solve this, we could take inspiration from the Leaf library and add a notion of resources guarding other resources.
In Leaf, the authors introduce the โค operator where G โค P means that the resource G guards the resource P. In other words, we can exchange exclusive ownership of resource G to obtain shared ownership of a resource G.
Mutex example (monomorphic)
Specification for a mutually exclusive lock that protects a single reference.
type mutex (*@ model protectee : loc *) val create : unit -> mutex (*@ consumes s @ int ref create [s : int ref] *) val lock : mutex -> unit (*@ lock m produces m @ acquired (* not strictly nec*) produces m.protectee @ int ref *) val unlock : mutex -> unit (*@ consumes m.protectee @ int ref consumes m @ acquired unlock m *) val try_lock : mutex -> bool (*@ let b = lock m in produces b -> m.protectee @ int ref *)
Mutex example
type (*@ 's *) mutex (*@ affine lens acquired (m : 's mutex) *) val create : unit -> (*@ 's *) mutex (*@ consumes 's create () *) val lock : (*@ 's *) mutex -> unit (*@ lock m produces 's *) val unlock : (*@ 's *) mutex -> unit (*@ consumes 's unlock m *) val try_lock : (*@ 's *) mutex -> bool (*@ try_lock m produces b -> 's *)
Explicitly guarded mutex
type 'a mutex type 'a guard (*@ ephemeral *) val create : 'a -> 'a mutex (*@ consumes s create s *) val lock : 'a mutex -> 'a guard (*@ g = lock m *) val unlock : 'a guard -> unit (*@ consumes g @ 'a guard unlock g *) val peek : 'a guard -> 'a (*@ s = peek g produces g @ 'a guard <-> s @ 'a *)
Explicitly guarded mutex v2
type 'a mutex type 'a guard val create : 'a -> 'a mutex (*@ consumes s create s *) val lock : 'a mutex -> 'a guard (*@ g = lock m *) val unlock : 'a guard -> unit (*@ consumes l @ 'a guard unlock l *) val peek : 'a guard -> 'a (*@ s = peek g produces g @ 'a guard == s @ 'a *) val try_lock : (*@ 's *) mutex -> bool (*@ try_lock m produces b -> 's *)
Shared Lock
type (*@ 'a *) rw_lock (*@ ephemeral *) (*@ type 'a exc ephemeral model m : loc model state : 'a *) (*@ type 'a sh guards : 'a model m : loc model state : 'a *) val rwlock_new : unit -> (*@ 'a *) rw_lock (*@ consumes s create [s : 'a] *) val rwlock_free : (*@ 'a *) rw_lock -> unit (*@ consumes m rwlock_free m *) val lock_exc : (*@ 'a *) rw_lock -> unit (*@ guarded m let [e : exc] = lock m in produces e @ 'a exc ensures e.m = &m *) val unlock_exc : (*@ 'a *) rw_lock -> unit (*@ guarded m requires e.m = &m consumes e @ 'a exc unlock [e : exc] m *) val lock_shared : (*@ 'a *) rw_lock -> unit (*@ guarded m let [e : sh] = lock m in produces e @ 'a exc ensures e.m = &m *) val unlock_shared : (*@ 'a *) rw_lock -> unit (*@ guarded m requires e.m = &m consumes e @ 'a exc let () = unlock [e : exc] m *)
Syntax for the Gospel Standard Library
When creating sequences, sets and bags, users are forces to use the functions exposed in the standard library. For example, if one were to create a set with every even number, they would have to write
Set.init (fun x -> x mod 2 = 0)
This is not ideal, since it is very verbose as well as starkly different from what one would expect from a mathematical definition. In this section, I will describe possible changes to the Gospel syntax to simplify writing specifications using the objects defined in the standard library.
The empty element
- Sequences :
[]
- Sets :
{}
Initialization
- Sequences :
[ x; y; z ]
creates a sequence containing elementsx
,y
,z
. These can be identifiers or formulas, but they naturally need to have the same type. A consequence of this is that the syntax for sequences would be the same for lists. - Sets :
{ x | fmla }
creates a set with every elementx
that satisfiesfmla
, which must be a valid proposition. An alternate way to create a set would be{ x; y; z }
which creates a set with the valuesx
,y
,z
. These can be identifiers or formulas, but they naturally need to have the same type.
Set operations
Miscellaneous changes
Section for minor changes I made in Gospel that aren't relevant for other sections.
old
in Gospel function specifications
In Gospel we can use the old
constructor in the specification of
logical functions specifications. In my opinion, this should not be
allowed since there is no practical reason to talk about the "old"
version of a variable that is never modified.
Gospel documentation
Fib numbers as first spec and machine integers add requires for fibonacci to prevent overflow
Rewrite the first spec for clarity
fibonacci with big int
Internal Changes to Gospel
DONE Normalise AST constructors for function applications
- State "DONE" from "TODO"
Currently Gospel has four constructors for function applications:
Tidapp
for application of an identifier to a list of arguments.Tapply
for arbitrary function applicationsTinfix
for infix operations that can be chained together (e.g.
3 < 4 < 5
)
Tbinop
for binary operations.
From my review of the code, I believe in the untyped AST we only need
two constructors, one for arbitrary function applications and another
for infix operations. The later isn't strictly necessary, however,
during parsing the only way of desugaring infix operators into a
Tapply
would be with either completely changing the parsing of terms
in a way that is not obvious to me or during parsing modifying the AST
so that chained infix operators get turned into a chain of /\
which
would be a bit painful. Right now, this transformation is done during
typechecking which also feels wrong. I believe that the best way of
going about doing this is to replace the Tinfix
node with something
that better describes a chain of infix operators.
I also found a funny bug. The following program should be accepted, but isn't:
#+beginexample ocaml (*@ function (<) (n : 'a) (m : 'a) : bool *)
(*@ function works : bool = (1 < 2) < true *)
(*@ function wrong : bool = true < (1 < 2) *) #+endsrc
TODO Gospel Type checker with Inferno
After doing some experiments with Inferno, I believe that is it a better alternative to current Gospel typechecking, given the following pros:
- Since Inferno is very high level and declarative, it becomes very easy to read, maintain and extend the type checker, as opposed to current Gospel which has to deal with all of the low level details of type checking and inference.
- It is less error prone since we are offloading a lot of the weight into Inferno, meaning we only have to ensure that all of our constraints are correct.
Things left to do
[-]
Type declarations[ ]
Mutually Recursive type declarations[X]
Abstract type definitions[X]
ADTs[ ]
Record types[X]
Type aliases
[X]
Logical Function declarations[X]
Recursive definitions
[ ]
Modules[ ]
Module Alias[ ]
Functors[ ]
Module Constraints[ ]
With type
[ ]
With module
[ ]
Module types[ ]
List of Signatures[ ]
Functors[ ]
Module type Constraint[ ]
With type
[ ]
Type substitution
[ ]
Module alias
[ ]
Value descriptions[ ]
Module opens[ ]
Exceptions[-]
Terms[ ]
Told[X]
Pattern matching[ ]
Records[ ]
Constructors
Issues
- During name resolution, we keep all the definitions of all currently processed files in memory. We should save these to a file and read them when necessary.
- I would prefer if types like
integer
,prop
, etc, were in the Gospel standard library although I am not sure how to do this. - We need a way to decode Inferno type variables into names that are legible but do not conflict with other names in scope.
Design of the typechecker.
After parsing a Gospel annotated OCaml file and creating an untyped Gospel AST we go through each signature and perform a series of intermediate checks, these include:
- Name resolution: In Gospel (like in OCaml), the name we use to reference a variable depends on where we reference it. For example, if we define a function [f] in a module [M], within that module, that function's name is [f], but outside of that module it is [M.f]. If we [open] the module, both names are valid. Since Inferno does not support any of these mechanisms, we must take care to tag every variable with a unique identifier. In our previous example, regardless of what name we use to reference [f], each instance of that variable will be tagged with the same identifier. This means when we build our constraints, it won't matter that the name of the variable is context dependent, since Inferno will only look at the unique identifier. A consequence of this is that that there should be no unbound variables when we build the Inferno constraint, since to tag every variable we must keep track of what variables have been defined.
- Infix operators: During parsing, we use the [Tinfix] constructor to mark a chain of infix operators. After parsing however, this constructor is redundant since during typechecking, it is equivalent to two nested [Tapply] constructors. Therefore, to simplify building the Inferno constraint, we remove this constructor from all terms and replace it with [Tapply]. Although this step could be done in [Checker], it makes more sense to do it during this phase.
- Duplicate names: There are some contexts in Gospel where we are not allowed to introduce the same variable twice into scope (e.g. duplicate function arguments). Since Inferno has no way to track this natively, we must handle this manually.
After these checks are complete, we create a copy of the signature where every identifier has been replaced with a uniquely tagged variable. If the signature has any Gospel terms, it is fed into the Inferno solver.
Design of the solver
Note: this section assumes some familiarity with Inferno
Most of the solver is pretty standard: we traverse the Gospel
signature, take care to put every variable in scope using def
constraints (meaning all local variables are created using a
monomorphic scheme) and using the hastype
function to infer the types
of each sub term. There are, nevertheless, a few features that are not
directly supported by Inferno that we had to implement ourselves:
Named type variables: When a user gives a name to a Gospel type variable it is considered rigid, meaning it cannot be unified with any other type. This means that Gospel functions such as:
function x : 'a = 0
Are ill typed since the rigid type variable
'a
cannot be unified withinteger
. Although Inferno does have enough infrastructure that we could easily introduce aforall
combinator that would create a rigid type variable, we went with a different approach. When we encounter a type annotation, we transform it into a deep type (an encoding provided by Inferno to turn types into Inferno variables). When we encounter a named type variable, instead of turning it into adeep
variable that can be unified with other types, we create aTvar
structure which can only be unified with otherTvar
with the same identifier.The reason why we don't use the rigid variables provided by Inferno is because of error messages. When we turn a Gospel type variable into an Inferno variable, there is no way to associate it with its name. This means when the Inferno variable is decoded back into a Gospel type variable, Inferno does no provide us with a way to give it its original name. This is problematic because it would mean error messages could not display the correct Gospel type. For example:
function f (s1 : 'a sequence) (s2 : 'b sequence) = s1 = s2
Should produce the message "mismatch between the type
'a sequence
and'b sequence
". However, there is no way for Inferno to get back to the original names'a
and'b
.Note: This works because we only use
def
constraints to introduce new variable names which means that each variable has a monomorphic type scheme, if we switch to a polymorphic type scheme, we probably need to use Inferno's rigid type variables.Top level definitions: if we feed the following signature into Inferno
axiom eq : x = y
it will raise anUnbound
exception, since the namesx
andy
are not defined within the scope of the axiom. However, it is possibly that the namesx
andy
were defined previously in the top level. Since the Inferno solver cannot be used in such a way that it "remembers" previous constraints it solve, we mark variables aslocal
during name resolution if they are defined in the term. When we are building a constraint, if we encounter a variablex
that is notlocal
, we go through the environment we create during name resolution and find the type ofx
.When we turn the type of
x
into a deep type, its type variables are turned into flexible Inferno variables using theexist
binder, meaning they can be unified with other types. This ensures that top level definitions are let polymorphic.Note: although it is technically possible to map a list of top level signatures to a sequence of nested
def
constraints, this is not worth it since it leads to a very messy implementation where the complexity added is much greater than simply keeping track of the types of top level signatures.
DONE Inferno Efficiency
- State "DONE" from "๐๏ธ๐๐๏ธ"
One possible downside is that I'm not sure with regards with
efficiency how Inferno behaves. More specifically, when does Inferno
populate the data structures it needs for type unification? Is it when
the constraint is built or when solve
is called? If the later, there
are some real questions with regards to efficiency
๐๏ธ๐๐๏ธ Gospel specifications in OCaml AST
In the design document Nicolas mentioned that Gospel specifications should be inserted directly in the OCaml AST instead of in a copy of it. I don't know how that would work in practice.
DONE Recursive flag in OCaml AST for types
- State "DONE" from "๐๏ธ๐๐๏ธ"
In the OCaml AST, types are always marked as recursive, I don't understand why.
Answer: turns out OCaml has a nonrec
annotation that allows you to
specify that a type is non recursive.