OCaml Compiler - before May 2021

Hi Discuss,

I'm happy to introduce the first issue of the "OCaml compiler development newsletter". I asked frequent contributors to the OCaml compiler codebase to write a small blurb on what they have been doing recently, in the interest of sharing more information on what people are interested in, looking at and working on.

This is by no means exhaustive: many people didn't end up having the time to write something, and it's fine. But hopefully this can give a small window on development activity related to the OCaml compiler, structured differently from the endless stream of Pull Requests on the compiler codebase.

(This initiative is inspired by the excellent Multicore newsletter. Please don't expect that it will be as polished or consistent :yo-yo: .)

Note:

  • Feel free of course to comment or ask questions, but I don't know if the people who wrote a small blurb will be looking at the thread, so no promises.

  • If you have been working on the OCaml compiler and want to say something, please feel free to post! If you would like me to get in touch next time I prepare a newsletter issue (some random point in the future), please let me know by email at (gabriel.scherer at gmail).


@dra27 (David Allsopp)

Compiler relocation patches now exist. There are still a few left to write, and they need splitting into reviewable PRs, but the core features are working. A compiler installation can be copied to a new location and still work, meaning that local switches in opam may in theory be renamed and, more importantly, we can cache previously-built compilers in an opam root to allow a new switch's compiler to be a copy. This probably won't be reviewed in time for 4.13, although it's intended that, once merged, opam-repository will carry back-ports to earlier compilers.

A whole slew of scripting pain has led to some possible patches to reduce the use of scripts in the compiler build to somewhat closer to none.

FlexDLL bootstrap has been completely overhauled, reducing build time considerably. This will be in 4.13 (#10135).

@nojb (Nicolás Ojeda Bär)

I am working on #10159, which enables debug information in -output-complete-exe binaries. It uses incbin under Unix-like systems and some other method under Windows.

@gasche (Gabriel Scherer)

I worked on bringing more PRs to a decision (merge or close). The number of open PRs has gone from 220-ish to 180, which feels nice.

I have also contributed to @Ekdohibs' project camlboot, which is a "bootstrap-free" implementation of OCaml able to compile the OCaml compiler itself. It currently targets OCaml 4.07 for various reasons. We were able to do a full build of the OCaml compiler, and check that the result produces bootstrap binaries that coincide with upstream bootstraps. This gives extremely strong confidence that the OCaml bootstrap is free from "trusting trust" attacks. For more details, see our draft paper.

with @Octachron (Florian Angeletti)

I worked with Florian Angeletti on deprecating certain command-line warning-specifier sequences, to avoid usability issues with (new in 4.12) warning names. Before this change, -w -partial-match disabled warning 8, but -w -partial was interpreted as the sequence -w -p -w a -w r -w t -w i -w a -w l, most of which were ignored, except that -w a silences all warnings. Now multi-letter sequences of "unsigned" specifiers (-p is signed, a is unsigned) are deprecated. (We first deprecated all unsigned specifiers, but Leo White tested the result and remarked that -w A is common, so now we only warn on multi-letter sequences of unsigned specifiers.)
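
To make the letter-by-letter reading above concrete, here is a minimal toy model in OCaml. It is only a sketch, not the compiler's actual parser: the names token, tokenize and has_multi_letter_unsigned_run are hypothetical, numeric specifiers and ranges are not handled, and the "run of unsigned letters" check is my approximation of the deprecation rule described above.

```ocaml
(* Toy model of the letter-by-letter reading of a -w argument.
   Illustration only; all names here are hypothetical. *)
type token =
  | Signed of char * char  (* a sign ('+', '-' or '@') followed by a letter *)
  | Unsigned of char       (* a bare letter with no sign in front *)

let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

(* Split a specifier string into signed and unsigned letter tokens. *)
let tokenize (spec : string) : token list =
  let n = String.length spec in
  let rec go i acc =
    if i >= n then List.rev acc
    else
      match spec.[i] with
      | ('+' | '-' | '@') as sign when i + 1 < n && is_letter spec.[i + 1] ->
        go (i + 2) (Signed (sign, spec.[i + 1]) :: acc)
      | c when is_letter c -> go (i + 1) (Unsigned c :: acc)
      | _ -> invalid_arg "tokenize: character not supported by this toy model"
  in
  go 0 []

(* True when at least two unsigned letters appear next to each other. *)
let has_multi_letter_unsigned_run (tokens : token list) : bool =
  let rec go prev_unsigned = function
    | [] -> false
    | Unsigned _ :: rest -> prev_unsigned || go true rest
    | Signed _ :: rest -> go false rest
  in
  go false tokens
```

In this model, tokenize "-partial" yields the signed token for p followed by the unsigned letters a, r, t, i, a, l, so has_multi_letter_unsigned_run returns true, while a lone "A" or "-p" does not trigger it.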

I am working with @Octachron (Florian Angeletti) on grouping signature items when traversing module signatures. Some items are "ghost items" that are morally attached to a "main item"; the code mostly ignores this, which creates various bugs in corner cases. This is work that Florian started in September 2019 with #8929, to fix a bug in the reprinting of signatures. I only started reviewing in May-September 2020 and we decided to make sizeable changes; he split it into several smaller changes in January 2021 and we merged it in April 2021. Now we are looking at fixing other bugs with his code (#9774, #10385). Just this week Florian landed a nice PR fixing several distinct issues related to signature item grouping: #10401.

@xavierleroy (Xavier Leroy)

I fixed #10339, a mysterious crash on the new Macs with "Apple silicon". This was due to an ARM (32 and 64 bits)-specific optimization of array bound checking, which was not taken into account by the platform-independent parts of the back-end, leading to incorrect liveness analysis and wrong register allocation. #10354 fixes this by informing the platform-independent parts of the back-end that some platform-specific instructions can raise. In passing, it refactors similar code that was duplicating platform-independent calculations (of which instructions are pure) in platform-dependent files.

I spent quality time with the Jenkins continuous integration system at Inria, integrating a new Mac Mini M1. For unknown reasons, Jenkins ran the CI script in x86-64 emulation mode, so we were building and testing an x86-64 version of OCaml instead of the intended ARM64 version. A bit of scripting later (8b1bc01c3) and voilà, arm64-macos is properly tested as part of our CI.

Currently, I'm reading the "safe points" proposal by Sadiq Jaffer (#10039) and the changes on top of this proposed by Damien Doligez. It's a necessary step towards Multicore OCaml, so we really need to move forward on this one. It's a nontrivial change involving a new static analysis and a number of tweaks in every code emitter, but things are starting to look good here.

@mshinwell (Mark Shinwell)

I did a first pass of review on the safe points PR (#10039) and significantly simplified the proposed backend changes. I've also been involved in discussions about a new function-level attribute to cause an error if safe points (including allocations) might exist within a function's body, to make code that currently assumes the absence of safe points robust. There will be a design document for this coming in due course.

I fixed the random segfaults that were occurring on the RISC-V Inria CI worker (#10349).

In Flambda 2 land we spent two person-days debugging a problem relating to Infix_tag! We discovered that the code in OCaml 4.12 onwards for traversing GC roots in static data ("caml_globals") is not correct if any of the roots are closures. This arises in part because the new compaction code (#9728) has a hidden invariant: it must not see any field of a static data root more than once (not even via an Infix_tag). As far as we know, these situations do not arise in the existing compiler, although we may propose a patch to guard against them. They arise with Flambda 2 because in order to compile statically-allocated inconstant closures (ones whose environment is partially or wholly computed at runtime) we register closures directly as global roots, so we can patch their environments later.

@garrigue (Jacques Garrigue)

I have been working on a number of PRs fixing bugs in the type system, which are now merged:

  • #10277 fixes a theoretical bug in the principality of GADT type inference (#10383 applies only in -principal mode)
  • #10308 fixes an interaction between local open in patterns and the new syntax for introducing existential type variables
  • #10322 is an internal change using a normal reference inside of a weak one for backtracking; the weak reference was an optimization when backtracking was a seldom used feature, and was not useful anymore
  • #10344 fixes a bug in the delaying of the evaluation of optional arguments
  • #10347 cleans up some code in the unification algorithm, after a strengthening of universal variable scoping
  • #10362 fixes a forgotten normalization in the type checking algorithm

Some are still in progress:

  • #10348 improves the way expansion is done during unification, to avoid some spurious GADT related ambiguity errors
  • #10364 changes the typing of the body of the cases of pattern-matchings, making it possible to warn in some non-principal situations; it also uncovered a number of principality-related bugs inside the type-checker

Finally, I have worked with Takafumi Saikawa (@t6s) on making the representation of types closer to its logical meaning, by ensuring that one always manipulates a normalized view in #10337 (large change, evaluation in progress).

@let-def (Frédéric Bour)

For some time, I have been working on new approaches to generate error messages from a Menhir parser.

My goal at the beginning was to detect and produce a precise message for the ‘let ;’ situation:

let x = 5;
let y = 6
let z = 7

LR detects an error at the third ‘let’ which is technically correct, although we would like to point the user at the ‘;’ which might be the root cause of the error. This goal has been achieved, but the prototype is far from being ready for production.
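
To see why the error only surfaces at the third ‘let’, it may help to compare the intended program with the one the parser was still hoping to read. The snippet below is a small illustration of mine, not part of the prototype; the binding w and the use of ignore (to avoid a non-unit statement warning) are my own additions.

```ocaml
(* Intended program: three top-level definitions, with no ';'. *)
let x = 5
let y = 6
let z = 7

(* After "5;", the parser is inside a sequence expression, so the next
   'let' can only start a local definition, which must be closed by 'in'.
   This is a valid completion of that reading. *)
let w = ignore 5; let y = 6 in y + 1
```

In the erroneous input, the parser is still waiting for that ‘in’ when the third ‘let’ arrives, which is why the error is reported there rather than at the ‘;’.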

The main idea to increase the expressiveness and maintainability of error context identification is to use a flavor of regular expressions. The stack of a parser defines a prefix of a sentential form. Our regular expressions are matched against it. Internal details of the automaton do not leak (no reference to states); the regular language is defined by the grammar alone. With appropriate tooling, specific situations can be captured by starting from a coarse expression and refining it to narrow down the interesting cases.
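
As a rough illustration of matching a regular expression against a stack of grammar symbols, here is a toy matcher in OCaml. It is a sketch under my own assumptions and does not reflect the design of the actual prototype: the types symbol and re, the symbol names, and the pattern semi_then_let are all hypothetical.

```ocaml
(* Grammar symbols as they might appear on a parser stack (hypothetical names). *)
type symbol = T of string | N of string

(* A tiny regular-expression language over symbols. *)
type re =
  | Sym of symbol      (* exactly this symbol *)
  | Any                (* any single symbol *)
  | Seq of re * re
  | Alt of re * re
  | Star of re

(* [matches re stack k] matches [re] against a prefix of [stack] and
   passes the unconsumed rest to the continuation [k]. *)
let rec matches (re : re) (stack : symbol list) (k : symbol list -> bool) : bool =
  match re, stack with
  | Sym s, x :: rest -> s = x && k rest
  | Sym _, [] -> false
  | Any, _ :: rest -> k rest
  | Any, [] -> false
  | Seq (a, b), _ -> matches a stack (fun rest -> matches b rest k)
  | Alt (a, b), _ -> matches a stack k || matches b stack k
  | Star a, _ ->
    k stack
    || matches a stack (fun rest ->
           (* only iterate if progress was made, to guarantee termination *)
           List.length rest < List.length stack && matches (Star a) rest k)

(* The whole stack must be consumed. *)
let matches_stack re stack = matches re stack (fun rest -> rest = [])

(* Coarse pattern: anything, then ';' immediately followed by 'let', then anything. *)
let semi_then_let =
  Seq (Star Any, Seq (Sym (T ";"), Seq (Sym (T "let"), Star Any)))

let example_stack =
  [ T "let"; N "pattern"; T "="; N "expr"; T ";";
    T "let"; N "pattern"; T "="; N "expr" ]
```

Here matches_stack semi_then_let example_stack holds, while the same pattern rejects a stack with no ‘;’. Refining the pattern would mean replacing the Star Any parts with more specific symbols.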

Now I am focusing on one specific point of the ‘error message’ development pipeline: improving the efficiency of ‘menhir --list-errors’. This command is used to enumerate sentences that cover all erroneous situations (as defined by the LR grammar). On my computer and with the OCaml grammar, it takes a few minutes and quite a lot of RAM. Early results are encouraging and I hope to have a PR for Menhir soon. The performance improvement we are aiming for is to make the command almost real time for common grammars and to tackle bigger grammars by reducing the memory needs. For instance, in the OCaml case, the runtime is down from 3 minutes to 2–3 seconds and memory consumption goes from a few GiB down to 200 MiB.