
The basics

In this short manual, we describe how the Helium compiler can be used, and in particular, describe the parameters that can be passed to Helium 1.7 (there are some notable differences with earlier versions of Helium).

For purposes of experimentation, start off by creating a file, say Simple.hs and enter:

module Simple where

main  = 2 + 4

When you run helium from the prompt without any parameter, you obtain a short description of the most often used parameters (it also indicates that an error has occurred, because helium really expects a module to be compiled; from now on we shall omit this part of the message).

Error in invocation: the name of the module to be compiled seems to be missing.

Helium compiler 1.8.1 (... CEST 2015)
Usage: helium [options] file [options]
  -b          --build                 recompile module even if up to date
  -B          --build-all             recompile all modules even if up to date
  -i          --dump-information      show information about this module
  -I          --dump-all-information  show information about all imported modules
              --enable-logging        enable logging, overrides previous disable-logging
              --disable-logging       disable logging (default), overrides previous enable-logging flags
  -a MESSAGE  --alert=MESSAGE         compiles with alert flag in logging; MESSAGE specifies the reason for the alert.
              --overloading           turn overloading on (default), overrides all previous no-overloading flags
              --no-overloading        turn overloading off, overrides all previous overloading flags
  -P PATH     --lvmpath=PATH          use PATH as search path
  -v          --verbose               show the phase the compiler is in
  -w          --no-warnings           do notflag warnings
  -X          --moreoptions           show more compiler options
              --info=NAME             display information about NAME

The compiler lists its most important options and its version number. To compile the program you write:

helium Simple.hs

which typically will result in:

Compiling Simple.hs
(3,1): Warning: Missing type signature: main :: Int
Compilation successful with 1 warning

By default (although this depends on whether the file hint.conf was modified at any time), helium allows overloading (ad hoc polymorphism) by means of type classes, as in Haskell 98 but in a restricted form, so that you can use the same symbol (==) to compare values of different types. Because type classes can complicate error messages, novice programmers can disallow overloading by passing --no-overloading along with the other options. To compile the Simple module without overloading, write:

helium --no-overloading -B Simple.hs

The --no-overloading flag does as told, and we'll get to the -B option later.
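
As a small illustration of what overloading buys you, the sketch below uses == at two different types. The function names sameInt and sameText are made up for this example, and it relies only on the Eq instances for Int and String that standard Haskell provides. With --no-overloading, == is presumably no longer available at every type, so do not expect this code to compile unchanged in that mode.

```haskell
-- Hypothetical helper names; only (==) itself comes from the Prelude.

-- (==) used at type Int.
sameInt :: Int -> Int -> Bool
sameInt x y = x == y

-- The very same symbol used at type String.
sameText :: String -> String -> Bool
sameText s t = s == t
```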

This brings us to a number of useful things to know about the Helium compiler. The first is that it compiles .hs files to .lvm files, which can then be run with one of two interpreters: lvmrun and runhelium. In practice, the latter is preferred, since you do not need to tell it where to find Helium's libraries. If you are familiar with Java, then helium is like javac, and lvmrun and runhelium are like java. To run the result of the previous non-overloaded compilation, write:

runhelium --no-overloading Simple.lvm

This should print 6 in your terminal. If you forget to pass --no-overloading, you will probably get no output; in general, the behaviour of the interpreter is then undefined.

The interpreter lvmrun is more flexible than runhelium, so there might be situations where you prefer to use lvmrun. For example, lvmrun supports many parameters (run it without any to see what they are) but they are typically of a rather technical nature, and we will not look into that here further.

Another thing to note is that when you run helium twice in succession on a correct Haskell source file, without changing it in the meantime, it will say:

Simple is up to date

The reason is that helium sees that the .hs file is older than the corresponding .lvm file, and therefore sees no need to compile it. However, when we recompiled Simple.hs earlier, the file had not changed, but we wanted to recompile it with different parameters. This is why we passed the -B option (-b would have worked as well in this case). When either is passed, the compiler skips the check for whether recompilation is necessary and simply recompiles. The difference is that under -B any modules imported by your source module are recompiled as well, whereas under -b they are not.
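
The decision described here can be modelled in a few lines of Haskell. This is only a sketch of the rule as stated in this manual, not Helium's actual code: the function name needsCompile, the use of plain integers as timestamps, and the treatment of a missing or equally old .lvm file are all assumptions.

```haskell
-- Hypothetical model of the up-to-date check; not taken from Helium's sources.
-- Timestamps are plain integers (say, seconds since some epoch).
needsCompile
  :: Bool           -- was -b or -B passed?
  -> Integer        -- modification time of the .hs file
  -> Maybe Integer  -- modification time of the .lvm file, if it exists
  -> Bool
needsCompile forced hsTime lvmTime
  | forced    = True                 -- -b/-B skip the check entirely
  | otherwise = case lvmTime of
      Nothing -> True                -- assumption: no .lvm file yet, so compile
      Just t  -> t <= hsTime         -- assumption: recompile unless .lvm is strictly newer
```

For instance, needsCompile False 100 (Just 200) is False (the "up to date" case), while needsCompile True 100 (Just 200) is True because the check is skipped.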

The Helium compiler tends to be something of a wise guy. For example, in our compilation, it continues to complain that main does not have a type signature, and it even tells you what it is, by means of a warning that it provides. To get rid of this and all other possible warnings, write:

helium -b -w Simple.hs

We strongly advise against turning the warnings off, and we also advise writing a type signature for every top-level definition. Other useful warnings that helium gives are when you shadow an identifier (e.g., by defining a local variable with exactly the same name as a top-level identifier), when you introduce variables that you never use, or when you omit a case in a pattern match. Many of these warnings point to real bugs in your code.

For example, when you modify Simple.hs like so:

module Simple where

empty []  = 1
emty  xs  = 0

main  = 2 + 4

and run

helium -b Simple.hs

then you obtain:

Compiling Simple.hs
(4,7): Warning: Variable "xs" is not used
(3,1): Warning: Missing type signature: empty :: [a] -> Int
(3,1): Warning: Missing pattern in function bindings: 
  empty (_ : _) = ...
(4,1): Warning: Missing type signature: emty :: a -> Int
(6,1): Warning: Missing type signature: main :: Int
Compilation successful with 5 warnings

Note how the compiler implicitly informs us that we misspelled empty: it indicates that we forgot a signature for both empty and emty, and it says that we omitted the cons pattern case for empty. It also says that xs is never used, but in this case that happened to be our intention.
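
For comparison, here is how the two equations might look once the typo is fixed, the signature is written out, and the cons case is covered. This is a sketch of one possible repair, not the only one; using a wildcard pattern instead of xs is our own choice, made so that no unused variable remains.

```haskell
-- One possible repair of the definition from Simple.hs.
empty :: [a] -> Int
empty []      = 1   -- the empty list
empty (_ : _) = 0   -- the cons case the warning asked for
```

This should address the warnings that concern empty and emty; main would still need a signature of its own.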

Now, if we change the definition of main to main = 2 ++ 4, then we obtain:

Compiling Simple.hs
(4,7): Warning: Variable "xs" is not used
(6,11): Type error in variable
 expression       : ++
   type           : [a] -> [a] -> [a]
   expected type  : Int -> Int -> b  
 probable fix     : use + instead

Compilation failed with 1 error

In other words, errors take precedence over warnings (in most cases). Only after you fix the problem (by deleting the extra +) will the warnings come back.
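
To see why the suggested fix makes sense, recall that + adds numbers while ++ concatenates lists. The names total and joined below are hypothetical and serve only to show each operator used at a type where it fits.

```haskell
-- (+) works on numbers.
total :: Int
total = 2 + 4          -- evaluates to 6

-- (++) works on lists, so both operands must be lists.
joined :: [Int]
joined = [2] ++ [4]    -- evaluates to [2,4]
```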

If helium seems to behave erratically (you pass parameters to it, but it does not seem to listen) or you simply want to know what it is up to, then pass the -v flag. First off, it will show you what settings have been made from the command line, and it will show the passes in the compiler, so that if things go wrong you get a first indication of where. For example, running helium -P /usr/local/helium/lib/ -b -v Simple.hs gives:

Options after simplification:
--lvmpath="/usr/local/helium/lib/"
--disable-logging
--overloading
--build
--verbose
Argument: Just "Simple.hs"
Prelude is up to date
Compiling Simple.hs
Lexing...
Parsing...
Importing...
Resolving operators...
Static checking...
(4,7): Warning: Variable "xs" is not used
Type inference directives...
Type inferencing...
(6,11): Type error in variable
 expression       : ++
   type           : [a] -> [a] -> [a]
   expected type  : Int -> Int -> b  
 probable fix     : use + instead

Compilation failed with 1 error

The first lines show which additional path is searched for .lvm files, that logging is disabled, that overloading is on, that the file must be built even if it already seems up to date, and that the compilation is verbose (duh). (Note that even if you pass the abbreviated form of an option, its long form is displayed.) Then the compilation process starts as usual, but this time the compiler lists the phases that it is in.

Note that the compiler says Prelude is up to date. Why is that? Haskell uses a Prelude that contains many often used function definitions, like map and such. These definitions also need to be compiled, and since the Prelude is always implicitly imported, the compiler mentions that it is already up to date and does not need to be recompiled. Typically, the Prelude will only be compiled the first time you run the compiler.

To continue, the type error message hints at replacing ++ with + to fix the problem. This is indeed one of the strong points of helium: it can add this kind of informed hint to the error messages it provides.

You might wonder what happens if you write:

helium -b -v --no-overloading --overloading Simple.hs

In the case of the overloading flags, the last one wins. This rule also applies to other toggle/boolean flags such as --enable-logging and --disable-logging. The rule does not apply to the -P option: every occurrence of the -P option adds to the list of known lvm paths. So you can write:

helium -P /usr/local/helium/lib -P . -b -v --no-overloading --overloading Simple.hs

and it will look for additional .lvm files in both . and /usr/local/helium/lib. Note that the space between the -P and the path is optional. If you separate them by a space, then you can still use tab-completion to look for the right directory (if you happen to use a terminal that supports tab-completion, of course).
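
The two rules (the last toggle wins, every -P accumulates) can be captured by a left fold over the flags. The sketch below is a hypothetical model of that behaviour; the type and function names are ours and are not taken from Helium's sources, and it says nothing about the order in which the paths are actually searched.

```haskell
-- Hypothetical model of the rules above; the names are ours, not Helium's.
data Flag
  = Overloading
  | NoOverloading
  | LvmPath String

-- Overloading setting plus the accumulated search paths.
data Settings = Settings Bool [String]
  deriving (Eq, Show)

-- Overloading is on by default and no extra paths are known.
defaults :: Settings
defaults = Settings True []

-- A toggle overwrites the previous value; a path is appended.
applyFlag :: Settings -> Flag -> Settings
applyFlag (Settings _  ps) Overloading   = Settings True ps
applyFlag (Settings _  ps) NoOverloading = Settings False ps
applyFlag (Settings ov ps) (LvmPath p)   = Settings ov (ps ++ [p])

-- Process the flags from left to right.
simplify :: [Flag] -> Settings
simplify = foldl applyFlag defaults
```

Applied to the flags of the command above, simplify [LvmPath "/usr/local/helium/lib", LvmPath ".", NoOverloading, Overloading] yields Settings True ["/usr/local/helium/lib","."]: overloading ends up on, and both paths are kept.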

There is one complication with having two modes of compilation: if you run an .lvm file compiled with, say, overloading turned on against libraries compiled with overloading turned off, then you may obtain some obscure error messages like the following:

exception: unable to load module "Prelude":
  module doesn't export symbol "$dictNumInt".

The solution is to force recompilation of your source modules (the criterion helium uses is whether the .lvm file is newer than the source, not whether you passed different parameters).

The logging facility and the alert flag

One of the more innovative features is that every compile can be logged by the helium compiler to a (logging) server. We have such a server, implemented in Java, that can be obtained from us on demand; it is not part of the standard Helium system. The idea of logging is that enough information is sent to the server to be able to recreate the logged compilation exactly. This amounts to sending, by means of socket communication, the module, the modules it imports (and so on), a description of the version of the compiler, and the parameters that were passed to the compiler. More details can be found in the technical report on the Helium logger in the publications section. Our purpose in logging Helium programs was to be able, at some point, to validate the work we had done on improving type error messages. Since then we have come to the conclusion that loggings can be used for many other purposes besides. One of these directly relates to a flag of the compiler as of version 1.7: the --alert flag (or -a). Consider the following program:

main = 2 +. 4

With overloading turned on, the compiler at some point suggested replacing +. with ++. A user might consider this to be a bug, because replacing +. with ++ will result in a type error. The user can alert the person who manages the server by redoing the compilation and adding the alert flag. Even though logging is turned off by default, the compiler will attempt to log the compile this time, and will also tag that logging so that the person who manages the logger can easily find out that somebody tried to draw attention to something. Besides alerting us to (seemingly) bad error messages, the facility can also be used to alert the manager to particularly good or clever error messages (by all means, do!).

To come back to the "bug" that the programmer thought to have observed: although the reasoning is correct, and we would prefer a better suggestion, the type inferencer never comes into play, because +. is not an operator in overloaded mode (it happens to be an operator in non-overloaded mode). What happens instead is that when you write an identifier or operator that is not in scope, the compiler looks for identifiers and operators with a similar name. It used to suggest possible replacements unless these were very short identifiers (of length at most one; this is why + was not suggested as a replacement). In recent versions of Helium this has been changed somewhat: the rule is currently that we report all identifiers that differ in at most one position from the unknown identifier, unless too many candidates qualify, in which case no suggestion is made. In the example this results in the following:

Compiling Simple.hs
(6,11): Undefined variable "+."
  Hint: Did you mean "+", "++" or "." ?
Compilation failed with 1 error
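
The manual does not spell out what "differ in at most one position" means precisely. Judging from the hint above, which lists both shorter and equally long names, one plausible reading is an edit distance of at most one (a single substitution, insertion, or deletion). The sketch below implements that reading; the function names and the maxHints cut-off are hypothetical and not taken from Helium's sources.

```haskell
-- Hypothetical reconstruction of the suggestion rule; not Helium's code.

-- True when the two names are at most one edit apart.
withinOneEdit :: String -> String -> Bool
withinOneEdit [] ys = length ys <= 1
withinOneEdit xs [] = length xs <= 1
withinOneEdit (x : xs) (y : ys)
  | x == y    = withinOneEdit xs ys
  | otherwise = xs == ys          -- x was substituted by y
             || xs == (y : ys)    -- x was deleted
             || (x : xs) == ys    -- y was inserted

-- Report all close names, or nothing when there are too many of them.
suggest :: Int -> [String] -> String -> [String]
suggest maxHints inScope unknown
  | length candidates > maxHints = []
  | otherwise                    = candidates
  where
    candidates = filter (withinOneEdit unknown) inScope
```

With the names "+", "++", ".", "-" and "map" in scope and a cut-off of 3, suggest 3 ["+", "++", ".", "-", "map"] "+." returns ["+","++","."], matching the hint shown above; with a cut-off of 2 it returns the empty list.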

Advanced flags

Although we have not discussed all the flags that are listed when you run helium, the flags discussed thus far are likely to be used the most. In this section, we consider the few remaining flags from the standard help screen, and then move on to the truly advanced flags. The -i (--dump-information) and -I (--dump-all-information) flags are mainly used by the interpreters. By means of these flags, the interpreters can retrieve the identifiers defined and imported by a source file, e.g., when you invoke the browse option :b in one of the interpreters. Note that if you already compiled the sources and did not change them, you need to pass -b in addition to -i or -I to get this information. You can also request information about a particular identifier by passing the option --info=NAME. It will then tell you where in the source file the identifier was defined, and what its type is.

Running

helium -X

actually gives us a whole host of other flags to pass to the compiler (all those listed after --info=NAME).

Error in invocation: the name of the module to be compiled seems to be missing.

Helium compiler ...
Usage: helium [options] file [options]
  -b          --build                        recompile module even if up to date
  -B          --build-all                    recompile all modules even if up to date
  -i          --dump-information             show information about this module
  -I          --dump-all-information         show information about all imported modules
              --enable-logging               enable logging, overrides previous disable-logging
              --disable-logging              disable logging (default), overrides previous enable-logging flags
  -a MESSAGE  --alert=MESSAGE                compiles with alert flag in logging; MESSAGE specifies the reason for the alert.
              --overloading                  turn overloading on (default), overrides all previous no-overloading flags
              --no-overloading               turn overloading off, overrides all previous overloading flags
  -P PATH     --lvmpath=PATH                 use PATH as search path
  -v          --verbose                      show the phase the compiler is in
  -w          --no-warnings                  do notflag warnings
  -X          --moreoptions                  show more compiler options
              --info=NAME                    display information about NAME
  -1          --stop-after-parsing           stop after parsing
  -2          --stop-after-static-analysis   stop after static analysis
  -3          --stop-after-type-inferencing  stop after type inferencing
  -4          --stop-after-desugaring        stop after desugaring into Core
  -t          --dump-tokens                  dump tokens to screen
  -u          --dump-uha                     pretty print abstract syntax tree
  -c          --dump-core                    pretty print Core program
  -C          --save-core                    write Core program to file
              --debug-logger                 show logger debug information
              --host=HOST                    specify which HOST to use for logging (default helium.zoo.cs.uu.nl)
              --port=PORT                    select the PORT number for the logger (default: 5010)
  -d          --type-debug                   debug constraint-based type inference
  -W          --algorithm-w                  use bottom-up type inference algorithm W
  -M          --algorithm-m                  use folklore top-down type inference algorithm M
              --no-directives                disable type inference directives
              --no-repair-heuristics         don't suggest program fixes
              --H-fullqualification          to determine fully qualified names for Holmes

The options -1, -2, -3, and -4 can be used to terminate the compilation process before code is generated. Typically, these are used by people trying to debug helium. The options -t, -u, and -c serve similar purposes. The compilation process does not transform a program directly to lvm code: a program is first desugared to Core (a slightly sugared lambda calculus), and only then transformed to lvm code. If you want to inspect this intermediate code, use -c to pretty print it or -C to write it to a file.

An important debug option is -d, because it shows the internals of the most innovative part of the compiler: its type inferencer. By default, the compiler uses a fast and simple type inferencer that solves constraints as it goes. When it finds an error, however, it reconsiders the part of the program where inference failed. It then uses a type graph data structure to discover what might have caused the error. This data structure allows the use of various heuristics that can help decide the cause of the error, and may suggest program fixes to overcome the problem.

The -d option gives a lot of information: which constraints are to be solved, which heuristics have been applied, whether they made a difference, and so on. To observe the difference between our type graph solver and Damas-Milner's algorithm W or the folklore algorithm M, you can explicitly tell the compiler to use one of these (by passing -W or -M). Furthermore, you can instruct the compiler never to suggest a program fix with --no-repair-heuristics, or to ignore type inference directives with --no-directives.

One of the problems with the logging facility is that it, necessarily, communicates with the outside world. For various reasons, the logging facility may not be working right (there might be an unexpected firewall, the server might not be turned on, the chosen port number might be taken by another process, and so on). Therefore, a special debug flag has been provided to get more information about what the logger is doing, --debug-logger.

While we are on the subject of logging: up to version 1.6, helium had the name of the computer that runs the logging server, and the port to which the compiler should connect, hardwired into it. Not anymore. You can change the logger's host and port number to whatever you like with --host and --port (but you might want to experiment a bit with --debug-logger turned on when you do). The default host and port to which programs are logged can be found by looking at the defaults for the flags host and port when you run helium -X. Note that internal errors of the compiler are also logged to this combination of host and port.
