Haskell 98 Syntax

B.1 Notational Conventions

These notational conventions are used for presenting syntax:

[pattern]	optional
{pattern}	zero or more repetitions
(pattern)	grouping
pat₁ \| pat₂	choice
pat_<pat'>	difference: elements generated by pat
	except those generated by pat'
`fibonacci`	terminal syntax in typewriter font

BNF-like syntax is used throughout, with productions having the form:

nonterm -> alt₁ | alt₂ | ... | alt_n

There are some families of nonterminals indexed by precedence levels (written as a superscript). Similarly, the nonterminals op, varop, and conop may have a double index: a letter l, r, or n for left-, right- or nonassociativity and a precedence level. A precedence-level variable i ranges from 0 to 9; an associativity variable a varies over {l, r, n}. Thus, for example

aexp -> ( expⁱ⁺¹ qop^(a,i) )

actually stands for 30 productions, with 10 substitutions for i and 3 for a.
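The count of 30 can be checked by enumerating the index pairs directly. The following is a minimal sketch; the names `Assoc`, `instances`, and `render` are illustrative only and do not appear in the report.

```haskell
-- Associativity index a, ranging over {l, r, n}.
data Assoc = AssocL | AssocR | AssocN deriving (Eq, Show)

-- Every (precedence, associativity) pair that indexes the schema
--   aexp -> ( exp^(i+1) qop^(a,i) )
instances :: [(Int, Assoc)]
instances = [ (i, a) | i <- [0 .. 9], a <- [AssocL, AssocR, AssocN] ]

-- Render the concrete production for one index pair.
render :: (Int, Assoc) -> String
render (i, a) =
  "aexp -> ( exp" ++ show (i + 1) ++ " qop(" ++ letter a ++ "," ++ show i ++ ") )"
  where
    letter AssocL = "l"
    letter AssocR = "r"
    letter AssocN = "n"
```

Evaluating `length instances` gives the number of productions the schema stands for, and `map render instances` lists them.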

In both the lexical and the context-free syntax, there are some ambiguities that are to be resolved by making grammatical phrases as long as possible, proceeding from left to right (in shift-reduce parsing, resolving shift/reduce conflicts by shifting). In the lexical syntax, this is the "maximal munch" rule. In the context-free syntax, this means that conditionals, let-expressions, and lambda abstractions extend to the right as far as possible.
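As a rough illustration of the maximal munch rule, the sketch below splits a string into tokens by always taking the longest run of characters of the same kind. It is a deliberately simplified stand-in for a real lexer: the function names are hypothetical, identifiers are approximated by alphanumeric runs, and comments, literals, and `:` are not handled.

```haskell
import Data.Char (isAlphaNum, isSpace)

-- ASCII symbol characters, following the ascSymbol production.
isSym :: Char -> Bool
isSym c = c `elem` "!#$%&*+./<=>?@\\^|-~"

-- Tokenize by taking the longest possible run at each step.
munch :: String -> [String]
munch [] = []
munch s@(c:cs)
  | isSpace c    = munch cs
  | isAlphaNum c = let (tok, rest) = span isAlphaNum s in tok : munch rest
  | isSym c      = let (tok, rest) = span isSym s in tok : munch rest
  | otherwise    = [c] : munch cs   -- specials such as ( ) , ;
```

Under this rule `x<=y` yields the single operator token `<=` rather than `<` followed by `=`, and `a-->b` yields the operator `-->` rather than a comment introduced by `--`.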

B.2 Lexical Syntax


program	`->`	{lexeme \| whitespace }
lexeme	`->`	qvarid \| qconid \| qvarsym \| qconsym
	`\|`	literal \| special \| reservedop \| reservedid
literal	`->`	integer \| float \| char \| string
special	`->`	`(` \| `)` \| `,` \| `;` \| `[` \| `]` \| `` ` `` \| `{` \| `}`
whitespace	`->`	whitestuff {whitestuff}
whitestuff	`->`	whitechar \| comment \| ncomment
whitechar	`->`	newline \| vertab \| space \| tab \| uniWhite
newline	`->`	return linefeed \| return \| linefeed \| formfeed
return	`->`	a carriage return
linefeed	`->`	a line feed
vertab	`->`	a vertical tab
formfeed	`->`	a form feed
space	`->`	a space
tab	`->`	a horizontal tab
uniWhite	`->`	any Unicode character defined as whitespace
comment	`->`	dashes [ any_<symbol> {any}] newline
dashes	`->`	`--` {`-`}
opencom	`->`	`{-`
closecom	`->`	`-}`
ncomment	`->`	opencom ANYseq {ncomment ANYseq} closecom
ANYseq	`->`	{ANY}_{<{ANY}( opencom \| closecom ) {ANY}>}
ANY	`->`	graphic \| whitechar
any	`->`	graphic \| space \| tab
graphic	`->`	small \| large \| symbol \| digit \| special \| `:` \| `"` \| `'`
small	`->`	ascSmall \| uniSmall \| `_`
ascSmall	`->`	`a` \| `b` \| ... \| `z`
uniSmall	`->`	any Unicode lowercase letter
large	`->`	ascLarge \| uniLarge
ascLarge	`->`	`A` \| `B` \| ... \| `Z`
uniLarge	`->`	any uppercase or titlecase Unicode letter
symbol	`->`	ascSymbol \| uniSymbol_{< special \| _ \| : \| " \| ' >}
ascSymbol	`->`	`!` \| `#` \| `$` \| `%` \| `&` \| `*` \| `+` \| `.` \| `/` \| `<` \| `=` \| `>` \| `?` \| `@`
	`\|`	`\` \| `^` \| `\|` \| `-` \| `~`
uniSymbol	`->`	any Unicode symbol or punctuation
digit	`->`	ascDigit \| uniDigit
ascDigit	`->`	`0` \| `1` \| ... \| `9`
uniDigit	`->`	any Unicode decimal digit
octit	`->`	`0` \| `1` \| ... \| `7`
hexit	`->`	digit \| `A` \| ... \| `F` \| `a` \| ... \| `f`
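The ncomment production allows comments to nest, so a recognizer has to count opening and closing delimiters rather than stop at the first `-}`. A minimal sketch of that counting, assuming the input begins just after an opening `{-` (the name `skipNested` is illustrative only):

```haskell
-- Skip a nested comment. The argument starts just after an opening "{-".
-- The result is the text following the matching "-}", or Nothing if the
-- comment is never closed.
skipNested :: String -> Maybe String
skipNested = go (1 :: Int)
  where
    go 0 rest             = Just rest          -- all open comments closed
    go _ []               = Nothing            -- input ended inside a comment
    go d ('{':'-':rest)   = go (d + 1) rest    -- opencom: one level deeper
    go d ('-':'}':rest)   = go (d - 1) rest    -- closecom: one level up
    go d (_:rest)         = go d rest
```

For example, `skipNested " a {- b -} c -} tail"` skips the inner comment as part of the outer one and returns the text after the second `-}`.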


varid	`->`	(small {small \| large \| digit \| `'` })_<reservedid>
conid	`->`	large {small \| large \| digit \| `'` }
reservedid	`->`	`case` \| `class` \| `data` \| `default` \| `deriving` \| `do` \| `else`
	`\|`	`if` \| `import` \| `in` \| `infix` \| `infixl` \| `infixr` \| `instance`
	`\|`	`let` \| `module` \| `newtype` \| `of` \| `then` \| `type` \| `where` \| `_`
varsym	`->`	( symbol {symbol \| `:`})_{<reservedop \| dashes>}
consym	`->`	(`:` {symbol \| `:`})_<reservedop>
reservedop	`->`	`..` \| `:` \| `::` \| `=` \| `\` \| `\|` \| `<-` \| `->` \| `@` \| `~` \| `=>`
varid			(variables)
conid			(constructors)
tyvar	`->`	varid	(type variables)
tycon	`->`	conid	(type constructors)
tycls	`->`	conid	(type classes)
modid	`->`	conid	(modules)
qvarid	`->`	[ modid `.` ] varid
qconid	`->`	[ modid `.` ] conid
qtycon	`->`	[ modid `.` ] tycon
qtycls	`->`	[ modid `.` ] tycls
qvarsym	`->`	[ modid `.` ] varsym
qconsym	`->`	[ modid `.` ] consym
decimal	`->`	digit{digit}
octal	`->`	octit{octit}
hexadecimal	`->`	hexit{hexit}
integer	`->`	decimal
	`\|`	`0o` octal \| `0O` octal
	`\|`	`0x` hexadecimal \| `0X` hexadecimal
float	`->`	decimal `.` decimal [exponent]
	`\|`	decimal exponent
exponent	`->`	(`e` \| `E`) [`+` \| `-`] decimal
char	`->`	`'` (graphic_{<' \| \>} \| space \| escape_<\&>) `'`
string	`->`	`"` {graphic_{<" \| \>} \| space \| escape \| gap}`"`
escape	`->`	`\` ( charesc \| ascii \| decimal \| `o` octal \| `x` hexadecimal )
charesc	`->`	`a` \| `b` \| `f` \| `n` \| `r` \| `t` \| `v` \| `\` \| `"` \| `'` \| `&`
ascii	`->`	`^`cntrl \| `NUL` \| `SOH` \| `STX` \| `ETX` \| `EOT` \| `ENQ` \| `ACK`
	`\|`	`BEL` \| `BS` \| `HT` \| `LF` \| `VT` \| `FF` \| `CR` \| `SO` \| `SI` \| `DLE`
	`\|`	`DC1` \| `DC2` \| `DC3` \| `DC4` \| `NAK` \| `SYN` \| `ETB` \| `CAN`
	`\|`	`EM` \| `SUB` \| `ESC` \| `FS` \| `GS` \| `RS` \| `US` \| `SP` \| `DEL`
cntrl	`->`	ascLarge \| `@` \| `[` \| `\` \| `]` \| `^` \| `_`
gap	`->`	`\` whitechar {whitechar}`\`
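To make the integer production concrete, the sketch below computes the value of a string that is exactly one integer lexeme. It is an illustration rather than part of the report: the function names are hypothetical, and only ASCII digits are accepted, whereas the digit production also admits Unicode decimal digits.

```haskell
import Data.Char (digitToInt, isDigit, isHexDigit, isOctDigit)

-- Value of a string that is exactly one `integer` lexeme, or Nothing.
integerLit :: String -> Maybe Integer
integerLit ('0':c:ds)
  | c `elem` "oO" && valid isOctDigit ds = Just (value 8 ds)
  | c `elem` "xX" && valid isHexDigit ds = Just (value 16 ds)
-- If neither prefix applies, fall through and try the decimal form.
integerLit ds
  | valid isDigit ds = Just (value 10 ds)
  | otherwise        = Nothing

-- A non-empty run of characters that all satisfy the digit predicate.
valid :: (Char -> Bool) -> String -> Bool
valid p ds = not (null ds) && all p ds

-- Positional value of a digit string in the given base.
value :: Integer -> String -> Integer
value base = foldl (\acc d -> acc * base + toInteger (digitToInt d)) 0
```

Note that `042` is a decimal literal under this grammar, since only the `0o` and `0O` prefixes introduce octal.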

B.3 Layout

Section 2.7 gives an informal discussion of the layout rule. This section defines it more precisely.

The meaning of a Haskell program may depend on its layout. The effect of layout on its meaning can be completely described by adding braces and semicolons in places determined by the layout. The meaning of this augmented program is now layout insensitive.

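The effect of layout is specified in this section by describing how to add braces and semicolons to a laid-out program. The specification takes the form of a function L that performs the translation. The input to L is:

A stream of lexemes as specified by the lexical syntax, with the following additional tokens. If a `let`, `where`, `do`, or `of` keyword is not followed by the lexeme `{`, the token {n} is inserted after the keyword, where n is the indentation of the next lexeme if there is one, or 0 if the end of file has been reached. If the first lexeme of a module is not `{` or `module`, then it is preceded by {n}, where n is the indentation of the lexeme. Where the start of a lexeme is preceded only by white space on the same line, this lexeme is preceded by <n>, where n is the indentation of the lexeme, provided that it is not, as a consequence of the first two rules, preceded by {n}.

A stack of "layout contexts", in which each element is either zero, indicating that the enclosing context is explicit (the programmer supplied the opening brace), or a positive integer, which is the indentation column of the enclosing layout context.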

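The "indentation" of a lexeme is the column number of the first character of that lexeme; the indentation of a line is the indentation of its leftmost lexeme. To determine the column number, assume a fixed-width font with the following conventions: the characters newline, return, linefeed, and formfeed all start a new line; the first column is designated column 1, not 0; tab stops are 8 characters apart; and a tab character causes the insertion of enough spaces to align the current position with the next tab stop.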
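The column computation can be sketched as follows, assuming the report's conventions that the first column is 1 and that tab stops are 8 characters apart. The names `columns` and `indentation` are illustrative only, and the sketch treats the first non-white character as the start of the leftmost lexeme, so it does not account for comments.

```haskell
import Data.Char (isSpace)

-- Pair each character of a single line with its column number.
columns :: String -> [(Int, Char)]
columns = go 1
  where
    go _ [] = []
    go col (c:cs) = (col, c) : go (next col c) cs
    -- A tab advances to the column just after the next tab stop
    -- (columns 9, 17, 25, ...); any other character advances by one.
    next col '\t' = ((col - 1) `div` 8 + 1) * 8 + 1
    next col _    = col + 1

-- Indentation of a line: the column of its first non-white character.
indentation :: String -> Maybe Int
indentation line =
  case [ col | (col, c) <- columns line, not (isSpace c) ] of
    (col:_) -> Just col
    []      -> Nothing
```

A line consisting of two spaces followed by `x` has indentation 3, while a line consisting of a tab followed by `x` has indentation 9.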

The application L tokens [] delivers a layout-insensitive translation of tokens, where tokens is the result of lexically analysing a module and adding column-number indicators to it as described above. The definition of L is as follows, where we use ":" as a stream construction operator, and "[]" for the empty stream.

L (<n>:ts) (m:ms)	=	`;` : (L ts (m:ms))	if m = n
	=	`}` : (L (<n>:ts) ms)	if n < m
L (<n>:ts) ms	=	L ts ms
L ({n}:ts) (m:ms)	=	`{` : (L ts (n:m:ms))	if n > m (Note 1)
L ({n}:ts) []	=	`{` : (L ts [n])	if n > 0 (Note 1)
L ({n}:ts) ms	=	`{` : `}` : (L (<n>:ts) ms)	(Note 2)
L (`}`:ts) (0:ms)	=	`}` : (L ts ms)	(Note 3)
L (`}`:ts) ms	=	parse-error	(Note 3)
L (`{`:ts) ms	=	`{` : (L ts (0:ms))	(Note 4)
L (t:ts) (m:ms)	=	`}` : (L (t:ts) ms)	if m /= 0 and parse-error(t) (Note 5)
L (t:ts) ms	=	t : (L ts ms)
L [] []	=	[]
L [] (m:ms)	=	`}` : L [] ms	if m /= 0 (Note 6)
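The equations above transcribe almost directly into Haskell. The following is a sketch under stated assumptions rather than a reference implementation: the `Token` type and the function name `layout` are illustrative, failure is reported with `Left`, and the parse-error(t) side condition, which in general needs the parser, is replaced by a caller-supplied predicate on the next lexeme.

```haskell
-- Tokens seen by L: ordinary lexemes plus the two layout indicators.
data Token
  = Lexeme String   -- an ordinary lexeme, including explicit "{" and "}"
  | Open Int        -- the indicator {n}
  | Line Int        -- the indicator <n>
  deriving (Eq, Show)

-- layout perr tokens stack: perr is a hypothetical approximation of the
-- parse-error(t) side condition. A clause whose guards all fail falls
-- through to the next clause, as in the equations.
layout :: (String -> Bool) -> [Token] -> [Int] -> Either String [String]
layout perr = go
  where
    go (Line n : ts) (m : ms)
      | m == n = fmap (";" :) (go ts (m : ms))
      | n < m  = fmap ("}" :) (go (Line n : ts) ms)
    go (Line _ : ts) ms = go ts ms
    go (Open n : ts) (m : ms)
      | n > m = fmap ("{" :) (go ts (n : m : ms))            -- Note 1
    go (Open n : ts) []
      | n > 0 = fmap ("{" :) (go ts [n])                     -- Note 1
    go (Open n : ts) ms =
      fmap (\out -> "{" : "}" : out) (go (Line n : ts) ms)   -- Note 2
    go (Lexeme "}" : ts) (0 : ms) = fmap ("}" :) (go ts ms)  -- Note 3
    go (Lexeme "}" : _) _ =
      Left "explicit close brace without explicit open brace" -- Note 3
    go (Lexeme "{" : ts) ms = fmap ("{" :) (go ts (0 : ms))  -- Note 4
    go (Lexeme t : ts) (m : ms)
      | m /= 0 && perr t = fmap ("}" :) (go (Lexeme t : ts) ms) -- Note 5
    go (Lexeme t : ts) ms = fmap (t :) (go ts ms)
    go [] [] = Right []
    go [] (m : ms)
      | m /= 0    = fmap ("}" :) (go [] ms)                  -- Note 6
      | otherwise = Left "end of input inside an explicit context"
```

With the crude predicate `(== "in")`, applying `layout` to the tokens of `let x = e; y = x in e'` (where the `{n}` indicator after `let` is `Open 5`) and the empty stack inserts the open brace after `let` and the close brace before `in`. A real implementation would consult the parser instead of a fixed predicate.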

If none of the rules given above matches, then the algorithm fails. It can fail for instance when the end of the input is reached, and a non-layout context is active, since the close brace is missing. Some error conditions are not detected by the algorithm, although they could be: for example `let }`.

Note 5 implements the feature that layout processing can be stopped prematurely by a parse error. For example, `let x = e; y = x in e'` is valid, because it translates to `let { x = e; y = x } in e'`. The close brace is inserted due to the parse-error rule above. The parse-error rule is hard to implement in its full generality, because doing so involves fixities. For example, the expression `do a == b == c` has a single unambiguous (albeit probably type-incorrect) parse, namely `(do { a == b }) == c`, because (==) is non-associative. Programmers are therefore advised to avoid writing code that requires the parser to insert a closing brace in such situations.

B.4 Literate comments

The "literate comment" convention, first developed by Richard Bird and Philip Wadler for Orwell, and inspired in turn by Donald Knuth's "literate programming", is an alternative style for encoding Haskell source code. The literate style encourages comments by making them the default. A line in which ">" is the first character is treated as part of the program; all other lines are comment.

The program text is recovered by taking only those lines beginning with ">", and replacing the leading ">" with a space. Layout and comments apply exactly as described in Appendix B in the resulting text.

To capture some cases where one omits a ">" by mistake, it is an error for a program line to appear adjacent to a non-blank comment line, where a line is taken as blank if it consists only of whitespace.
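The recovery of program text, together with the adjacency check, can be sketched as below. This is an illustration only: the names `Kind`, `classify`, and `unlit` are hypothetical, and the error is reported simply as the 1-based number of the second line of the first offending pair.

```haskell
import Data.Char (isSpace)

data Kind = Program | Blank | Comment deriving (Eq, Show)

-- A line is program text if ">" is its first character, blank if it
-- consists only of whitespace, and comment otherwise.
classify :: String -> Kind
classify ('>':_) = Program
classify l
  | all isSpace l = Blank
  | otherwise     = Comment

-- Keep only the program lines, replacing the leading ">" with a space.
-- Left n reports a program line adjacent to a non-blank comment line.
unlit :: String -> Either Int String
unlit src =
  case bad of
    (n:_) -> Left n
    []    -> Right (unlines [ ' ' : rest | ('>':rest) <- ls ])
  where
    ls    = lines src
    kinds = map classify ls
    bad   = [ n | (n, a, b) <- zip3 [2 ..] kinds (drop 1 kinds), clash a b ]
    clash a b = (a, b) `elem` [(Program, Comment), (Comment, Program)]
```

Replacing ">" with a space, rather than deleting it, keeps every remaining character in its original column, so the layout rule sees the same indentation as in the literate source.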

By convention, the style of comment is indicated by the file extension, with ".hs" indicating a usual Haskell file and ".lhs" indicating a literate Haskell file. Using this style, a simple factorial program would be:

   This literate program prompts the user for a number
   and prints the factorial of that number:

> main :: IO ()
> main = do putStr "Enter a number: "
>           l <- getLine
>           putStr "n!= "
>           print (fact (read l))

   This is the factorial function.

> fact :: Integer -> Integer
> fact 0 = 1
> fact n = n * fact (n-1)

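An alternative style of literate programming is particularly suitable for use with the LaTeX text processing system. In this convention, only those parts of the literate program that are entirely enclosed between \begin{code}...\end{code} delimiters are treated as program text; all other lines are comment. More precisely, program code begins on the first line following a line that begins \begin{code}, and ends just before a subsequent line that begins \end{code} (ignoring string literals, of course).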
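A minimal sketch of extracting program text in the LaTeX convention follows. It assumes that code starts on the line after one beginning with \begin{code} and stops before the next line beginning with \end{code}; the name `unlitLatex` is illustrative, and the sketch does not attempt to ignore delimiters that occur inside string literals.

```haskell
import Data.List (isPrefixOf)

-- Keep only the lines enclosed between \begin{code} and \end{code}.
unlitLatex :: String -> String
unlitLatex = unlines . outside . lines
  where
    -- Outside a code block: drop lines until a \begin{code} line.
    outside [] = []
    outside (l:ls)
      | "\\begin{code}" `isPrefixOf` l = inside ls
      | otherwise                      = outside ls
    -- Inside a code block: keep lines until an \end{code} line.
    inside [] = []
    inside (l:ls)
      | "\\end{code}" `isPrefixOf` l = outside ls
      | otherwise                    = l : inside ls
```

The delimiter lines themselves are treated as comment, so they never reach the compiler.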

B.5 Context-Free Syntax


exports	`->`	`(` export₁ `,` ... `,` export_n [ `,` ] `)`	(n>=0)
export	`->`	qvar
	`\|`	qtycon [`(..)` \| `(` cname₁ `,` ... `,` cname_n `)`]	(n>=0)
	`\|`	qtycls [`(..)` \| `(` qvar₁ `,` ... `,` qvar_n `)`]	(n>=0)
	`\|`	`module` modid


impdecl	`->`	`import` [`qualified`] modid [`as` modid] [impspec]
	`\|`		(empty declaration)
impspec	`->`	`(` import₁ `,` ... `,` import_n [ `,` ] `)`	(n>=0)
	`\|`	`hiding` `(` import₁ `,` ... `,` import_n [ `,` ] `)`	(n>=0)
import	`->`	var
	`\|`	tycon [ `(..)` \| `(` cname₁ `,` ... `,` cname_n `)`]	(n>=0)
	`\|`	tycls [`(..)` \| `(` var₁ `,` ... `,` var_n `)`]	(n>=0)
cname	`->`	var \| con

topdecls	`->`	topdecl₁ `;` ... `;` topdecl_n	(n>=0)
topdecl	`->`	`type` simpletype `=` type
	`\|`	`data` [context `=>`] simpletype `=` constrs [deriving]
	`\|`	`newtype` [context `=>`] simpletype `=` newconstr [deriving]
	`\|`	`class` [scontext `=>`] tycls tyvar [`where` cdecls]
	`\|`	`instance` [scontext `=>`] qtycls inst [`where` idecls]
	`\|`	`default` `(`type₁ `,` ... `,` type_n`)`	(n>=0)
	`\|`	decl


decls	`->`	`{` decl₁ `;` ... `;` decl_n `}`	(n>=0)
decl	`->`	gendecl
	`\|`	(funlhs \| pat⁰) rhs
cdecls	`->`	`{` cdecl₁ `;` ... `;` cdecl_n `}`	(n>=0)
cdecl	`->`	gendecl
	`\|`	(funlhs \| var) rhs
idecls	`->`	`{` idecl₁ `;` ... `;` idecl_n `}`	(n>=0)
idecl	`->`	(funlhs \| var) rhs
	`\|`		(empty)
gendecl	`->`	vars `::` [context `=>`] type	(type signature)
	`\|`	fixity [integer] ops	(fixity declaration)
	`\|`		(empty declaration)
ops	`->`	op₁ `,` ... `,` op_n	(n>=1)
vars	`->`	var₁ `,` ...`,` var_n	(n>=1)
fixity	`->`	`infixl` \| `infixr` \| `infix`


type	`->`	btype [`->` type]	(function type)
btype	`->`	[btype] atype	(type application)
atype	`->`	gtycon
	`\|`	tyvar
	`\|`	`(` type₁ `,` ... `,` type_k `)`	(tuple type, k>=2)
	`\|`	`[` type `]`	(list type)
	`\|`	`(` type `)`	(parenthesized constructor)
gtycon	`->`	qtycon
	`\|`	`()`	(unit type)
	`\|`	`[]`	(list constructor)
	`\|`	`(->)`	(function constructor)
	`\|`	`(,`{`,`}`)`	(tupling constructors)
context	`->`	class
	`\|`	`(` class₁ `,` ... `,` class_n `)`	(n>=0)
class	`->`	qtycls tyvar
	`\|`	qtycls `(` tyvar atype₁ ... atype_n `)`	(n>=1)
scontext	`->`	simpleclass
	`\|`	`(` simpleclass₁ `,` ... `,` simpleclass_n `)`	(n>=0)
simpleclass	`->`	qtycls tyvar

simpletype	`->`	tycon tyvar₁ ... tyvar_k	(k>=0)
constrs	`->`	constr₁ `\|` ... `\|` constr_n	(n>=1)
constr	`->`	con [`!`] atype₁ ... [`!`] atype_k	(arity con = k, k>=0)
	`\|`	(btype \| `!` atype) conop (btype \| `!` atype)	(infix conop)
	`\|`	con `{` fielddecl₁ `,` ... `,` fielddecl_n `}`	(n>=0)
newconstr	`->`	con atype
	`\|`	con `{` var `::` type `}`
fielddecl	`->`	vars `::` (type \| `!` atype)
deriving	`->`	`deriving` (dclass \| `(`dclass₁`,` ... `,` dclass_n`)`)	(n>=0)
dclass	`->`	qtycls
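The declaration forms above can be seen together in a short example. The types and names below are invented for illustration; they exercise a strict positional field, an infix constructor operator, a record with a shared strict field declaration, and the record form of newconstr.

```haskell
infixl 6 :+:

-- constr with a strict field, and an alternative using an infix conop.
data Expr = Lit !Int | Expr :+: Expr

-- constr with field declarations; px and py share one fielddecl
-- whose type is marked strict.
data Point = Point { px, py :: !Double } deriving (Eq, Show)

-- newconstr in its record form: exactly one field.
newtype Age = Age { unAge :: Int } deriving (Eq, Ord, Show)

eval :: Expr -> Int
eval (Lit n)   = n
eval (a :+: b) = eval a + eval b

origin :: Point
origin = Point { px = 0, py = 0 }   -- labeled construction

moved :: Point
moved = origin { px = 3 }           -- labeled update
```

The fixity declaration `infixl 6 :+:` is an instance of the gendecl production; without it the operator would default to `infixl 9`.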

inst	`->`	gtycon
	`\|`	`(` gtycon tyvar₁ ... tyvar_k `)`	(k>=0, tyvars distinct)
	`\|`	`(` tyvar₁ `,` ... `,` tyvar_k `)`	(k>=2, tyvars distinct)
	`\|`	`[` tyvar `]`
	`\|`	`(` tyvar₁ `->` tyvar₂ `)`	(tyvar₁ and tyvar₂ distinct)


funlhs	`->`	var apat {apat }
	`\|`	patⁱ⁺¹ varop^(a,i) patⁱ⁺¹
	`\|`	lpatⁱ varop^(l,i) patⁱ⁺¹
	`\|`	patⁱ⁺¹ varop^(r,i) rpatⁱ
	`\|`	`(` funlhs `)` apat {apat }
rhs	`->`	`=` exp [`where` decls]
	`\|`	gdrhs [`where` decls]
gdrhs	`->`	gd `=` exp [gdrhs]
gd	`->`	`\|` exp⁰

exp	`->`	exp⁰ `::` [context `=>`] type	(expression type signature)
	`\|`	exp⁰
expⁱ	`->`	expⁱ⁺¹ [qop^(n,i) expⁱ⁺¹]
	`\|`	lexpⁱ
	`\|`	rexpⁱ
lexpⁱ	`->`	(lexpⁱ \| expⁱ⁺¹) qop^(l,i) expⁱ⁺¹
lexp⁶	`->`	`-` exp⁷
rexpⁱ	`->`	expⁱ⁺¹ qop^(r,i) (rexpⁱ \| expⁱ⁺¹)
exp¹⁰	`->`	`\` apat₁ ... apat_n `->` exp	(lambda abstraction, n>=1)
	`\|`	`let` decls `in` exp	(let expression)
	`\|`	`if` exp `then` exp `else` exp	(conditional)
	`\|`	`case` exp `of` `{` alts `}`	(case expression)
	`\|`	`do` `{` stmts `}`	(do expression)
	`\|`	fexp
fexp	`->`	[fexp] aexp	(function application)

aexp	`->`	qvar	(variable)
	`\|`	gcon	(general constructor)
	`\|`	literal
	`\|`	`(` exp `)`	(parenthesized expression)
	`\|`	`(` exp₁ `,` ... `,` exp_k `)`	(tuple, k>=2)
	`\|`	`[` exp₁ `,` ... `,` exp_k `]`	(list, k>=1)
	`\|`	`[` exp₁ [`,` exp₂] `..` [exp₃] `]`	(arithmetic sequence)
	`\|`	`[` exp `\|` qual₁ `,` ... `,` qual_n `]`	(list comprehension, n>=1)
	`\|`	`(` expⁱ⁺¹ qop^(a,i) `)`	(left section)
	`\|`	`(` lexpⁱ qop^(l,i) `)`	(left section)
	`\|`	`(` qop^(a,i)_<-> expⁱ⁺¹ `)`	(right section)
	`\|`	`(` qop^(r,i)_<-> rexpⁱ `)`	(right section)
	`\|`	qcon `{` fbind₁ `,` ... `,` fbind_n `}`	(labeled construction, n>=0)
	`\|`	aexp_<qcon> `{` fbind₁ `,` ... `,` fbind_n `}`	(labeled update, n >= 1)
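The section productions, including the exclusion of `-` from right sections, can be illustrated with a few definitions. The names are invented for this example.

```haskell
-- Right section ( qop exp ): the right operand is fixed.
halve :: Double -> Double
halve = (/ 2)            -- \x -> x / 2

-- Left section ( exp qop ): the left operand is fixed.
oneOver :: Double -> Double
oneOver = (1 /)          -- \y -> 1 / y

-- "(- 1)" is not a right section, because the grammar excludes "-"
-- there; it is a parenthesized negation, the number -1.
minusOne :: Int
minusOne = (- 1)

-- The Prelude function subtract gives the function \x -> x - 1.
decrement :: Int -> Int
decrement = subtract 1
```

Here `halve 10` and `oneOver 4` apply the sections as ordinary functions, while `minusOne` is a plain integer.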


qual	`->`	pat `<-` exp	(generator)
	`\|`	`let` decls	(local declaration)
	`\|`	exp	(guard)
alts	`->`	alt₁ `;` ... `;` alt_n	(n>=0)
alt	`->`	pat `->` exp [`where` decls]
	`\|`	pat gdpat [`where` decls]
	`\|`		(empty alternative)
gdpat	`->`	gd `->` exp [ gdpat ]
stmts	`->`	stmt₁ ... stmt_n exp [`;`]	(n>=0)
stmt	`->`	exp `;`
	`\|`	pat `<-` exp `;`
	`\|`	`let` decls `;`
	`\|`	`;`	(empty statement)
fbind	`->`	qvar `=` exp

pat	`->`	var `+` integer	(successor pattern)
	`\|`	pat⁰
patⁱ	`->`	patⁱ⁺¹ [qconop^(n,i) patⁱ⁺¹]
	`\|`	lpatⁱ
	`\|`	rpatⁱ
lpatⁱ	`->`	(lpatⁱ \| patⁱ⁺¹) qconop^(l,i) patⁱ⁺¹
lpat⁶	`->`	`-` (integer \| float)	(negative literal)
rpatⁱ	`->`	patⁱ⁺¹ qconop^(r,i) (rpatⁱ \| patⁱ⁺¹)
pat¹⁰	`->`	apat
	`\|`	gcon apat₁ ... apat_k	(arity gcon = k, k>=1)


apat	`->`	var [`@` apat]	(as pattern)
	`\|`	gcon	(arity gcon = 0)
	`\|`	qcon `{` fpat₁ `,` ... `,` fpat_k `}`	(labeled pattern, k>=0)
	`\|`	literal
	`\|`	`_`	(wildcard)
	`\|`	`(` pat `)`	(parenthesized pattern)
	`\|`	`(` pat₁ `,` ... `,` pat_k `)`	(tuple pattern, k>=2)
	`\|`	`[` pat₁ `,` ... `,` pat_k `]`	(list pattern, k>=1)
	`\|`	`~` apat	(irrefutable pattern)
fpat	`->`	qvar `=` pat


gcon	`->`	`()`
	`\|`	`[]`
	`\|`	`(,`{`,`}`)`
	`\|`	qcon
var	`->`	varid \| `(` varsym `)`	(variable)
qvar	`->`	qvarid \| `(` qvarsym `)`	(qualified variable)
con	`->`	conid \| `(` consym `)`	(constructor)
qcon	`->`	qconid \| `(` gconsym `)`	(qualified constructor)
varop	`->`	varsym \| `` ` `` varid `` ` ``	(variable operator)
qvarop	`->`	qvarsym \| `` ` `` qvarid `` ` ``	(qualified variable operator)
conop	`->`	consym \| `` ` `` conid `` ` ``	(constructor operator)
qconop	`->`	gconsym \| `` ` `` qconid `` ` ``	(qualified constructor operator)
op	`->`	varop \| conop	(operator)
qop	`->`	qvarop \| qconop	(qualified operator)
gconsym	`->`	`:` \| qconsym


module	`->`	`module` modid [exports] `where` body
	`\|`	body
body	`->`	`{` impdecls `;` topdecls `}`
	`\|`	`{` impdecls `}`
	`\|`	`{` topdecls `}`
impdecls	`->`	impdecl₁ `;` ... `;` impdecl_n	(n>=1)
