The syntax of a language is described by a grammar, grammars must be precise and unambiguous so there is no possibility of misinterpretation.
BNF consists of a set of rules called production rules, each rule has the form:
<symbol> := <symbol1>|<symbol2><symbol3>| C<symbol>
Parts of a BNF production rule:
:= meaning "is defined as" | meaning "or" < > angle brackets used to surround symbols.The angle brackets distinguish syntax rule symbols (also called non-terminal symbols) from terminal symbols which are written exactly as they are to be represented. A BNF rule defining a nonterminal has the form:
nonterminal := sequence_of_alternatives consisting of strings of terminals or nonterminals separated by the meta-symbol |For example, a BNF production rule for a mini-language might be:
<program> := program <declaration_sequence> begin <statements_sequence> end ;This shows that a mini-language program consists of the keyword "program" followed by the declaration sequence, then the keyword "begin" and the statements sequence, finally the keyword "end" and a semicolon.
recursive rules are used to define sequences. '..' may be used to simplify long lists of terminal symbols.
<letter> := a..z|A..Z <digit> := 0..9 <alnum> := <letter>|<digit> <id> := <letter>|<id><alnum>This says that an id is a single letter or a letter followed by any number of letters or digits.
Now here is the definition of BNF expressed in BNF: ( non terminals are shown in italics and terminals in bold underlined to avoid confusion)
<rule> := <<id>> := <expression>
<expression> := <term> | <term> | <expression>
<term> := <<id>> | <terminal> | <term><<id>> | <term><terminal>BNF is not only important for describing syntax rules in books, it is very commonly used (with variants) by syntactic tools. See for example any book on LEX and YACC, the standard UNIX parser generators.
<rule> := <id> --> <expression>
<expression> := <term> | <term> | <expression>
<term> := <id> | [<terminal>] | <term><id> | <term>[<terminal>]The BNF rules for a simplified C language:
<program> := <decls>|<functiondef>|<program><decls>| <program><functiondef> <all_stmts> := <stmtlist><stop_stmt>|<stmtlist> <stmtlist> := <stmtlist><stmt>|<stmt>
<stmt> := <compstmt>|<expr>';'| if(<condexpr>)<stmt>| if(<condexpr>)<stmt>else<stmt>| while(<condexpr>)<stmt>| do<stmt>while(condexpr);| for(<exprseq>;<condexpr>;<exprseq>)<stmt>| switch(val)<casestmts>|;
<compsmt> := {<all_stmts>}|
{<decls><all_stmts>}
<type> := int|char|unsigned|unsigned char|
float|short|long|unsigned short|
unsigned long|double|<usertype>|
struct <structname> {<decls>}|
<type> *
<typedef> := typedef <type> <usertype>
<functiondef> := <type> <id>(<args>)<compsmt>| <type> <id>()<compsmt>
<args> := <type> <id>|<args>,<type> <id>
<decls> := <type> <id>;|<type> <id>;<decls>
<exprseq> := <expr>|<exprseq>,<expr>
<stop_stmt> := break;|continue;| return;|return <val>;|return(<val>);
<casestmts> := {<cases><default>}
<cases> := <cases><case>|<case>
<case> := case <const>:<stmtlist><stop_stmt>
<default> := default:<stmtlist><stop_stmt>
<expr> := <rhs>|<modify_expr>
<call_expr> := <id>(<arglist>)
<arglist> := <arglist>,<val>|<val>
<modify_expr> := <varname>=<rhs>|*<id>=<rhs>
<rhs> := <binary_expr>|<unary_expr>
<unary_expr> := <simp_expr>|*<id>|&<varname>| <call_expr>|<unop><val>|(<type>)<varname>
<binary_expr> := <val><binop><val> <unop> := +|-|~|!
<binop> := -|+|/|*|%|&|&&||||||^
<relop> : <|<=|>|>=
<condexpr> := <val>|<val><relop><val>
<simp_expr> := <varname>|<const> <varname> := <idlist>.<id>|<id>
<arrayref> := <id><reflist>
<reflist> := [<val>]|<reflist>[<val>]
<idlist> := <id>|<idlist>.<id>