
Comparison of integer literal syntax

A comparison of the syntaxes for integer literals in various programming languages.

The sign (i.e., - or sometimes +) is often considered an operator and not documented in the grammars with integer literals, so it is not included here.

This focuses on grammars. For history and context, read the Integer Literals section in Dennie Van Tassel’s History and comparison of programming languages. Pascal Rigaux’s Syntax across languages covers more languages than are listed here.

Summary

| Language | Bases | Decimal prefix | Binary prefix | Octal prefix | Hex prefix | Leading zero | Separator | Leading sep | Trailing sep | Repeated sep | Suffix |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ada | 2–16 | "", 10# | 2# | 8# | 16# | Decimal | _ | No | No | No | Exponent |
| C23 | 2, 8, 10, 16 | "" | 0b, 0B | 0 | 0x, 0X | Octal | ' | No | No | No | Type |
| C89–C17 | 8, 10, 16 | "" | N/A | 0 | 0x, 0X | Octal | N/A | N/A | N/A | N/A | Type |
| C# | 2, 10, 16 | "" | 0b, 0B | N/A | 0x, 0X | Decimal | _ | Yes | No | Yes | Type |
| Erlang | 2–36 | "", 10# | 2# | 8# | 16# | Decimal | _ | No | No | No | N/A |
| Go 1.13+ | 2, 8, 10, 16 | "" | 0b, 0B | 0, 0o, 0O | 0x, 0X | Octal | _ | Yes | No | No | N/A |
| Go ≤1.12 | 8, 10, 16 | "" | N/A | 0 | 0x, 0X | Octal | N/A | N/A | N/A | N/A | N/A |
| Haskell with extensions | 2, 8, 10, 16 | "" | 0b, 0B | 0o, 0O | 0x, 0X | Decimal | _ | Yes | No | Yes | N/A |
| Haskell 1.3+ | 8, 10, 16 | "" | N/A | 0o, 0O | 0x, 0X | Decimal | N/A | N/A | N/A | N/A | N/A |
| Haskell 1.0–1.2 | 10 | "" | N/A | N/A | N/A | Decimal | N/A | N/A | N/A | N/A | N/A |
| Java 7+ | 2, 8, 10, 16 | "" | 0b, 0B | 0 | 0x, 0X | Octal | _ | No | No | Yes | Type |
| Java ≤6 | 8, 10, 16 | "" | N/A | 0 | 0x, 0X | Octal | N/A | N/A | N/A | N/A | Type |
| JSON | 10 | "" | N/A | N/A | N/A | Illegal | N/A | N/A | N/A | N/A | Exponent |
| Python 3.6+ | 2, 8, 10, 16 | "" | 0b, 0B | 0o, 0O | 0x, 0X | Illegal | _ | Yes | No | No | N/A |
| Python 3.0–3.5 | 2, 8, 10, 16 | "" | 0b, 0B | 0o, 0O | 0x, 0X | Illegal | N/A | N/A | N/A | N/A | N/A |
| Python 2.6–2.7 | 2, 8, 10, 16 | "" | 0b, 0B | 0, 0o, 0O | 0x, 0X | Octal | N/A | N/A | N/A | N/A | Type |
| Python ≤2.5 | 8, 10, 16 | "" | N/A | 0 | 0x, 0X | Octal | N/A | N/A | N/A | N/A | Type |
| Ruby | 2, 8, 10, 16 | "", 0d, 0D | 0b, 0B | 0, 0o, 0O | 0x, 0X | Octal | _ | No | No | No | N/A |
| Rust | 2, 8, 10, 16 | "" | 0b | 0o | 0x | Decimal | _ | Yes | Yes | Yes | Type |
| Rust 0.1–0.8 | 2, 10, 16 | "" | 0b | N/A | 0x | Decimal | _ | Yes | Yes | Yes | Type |
| Visual Basic 15.5 | 2, 8, 10, 16 | "" | &B, &b | &O, &o | &H, &h | Decimal | _ | Yes | No | Yes | Type |
| Visual Basic 15.0 | 2, 8, 10, 16 | "" | &B, &b | &O, &o | &H, &h | Decimal | _ | No | No | Yes | Type |
| Visual Basic 7.0 | 8, 10, 16 | "" | N/A | &O, &o | &H, &h | Decimal | N/A | N/A | N/A | N/A | Type |
| YAML 1.2 | 8, 10, 16 | "" | N/A | 0o | 0x | Decimal | N/A | N/A | N/A | N/A | N/A |
| YAML 1.1 | 2, 8, 10, 16, 60 | "" | 0b | 0 | 0x | Decimal | _ | Yes | Yes | Yes | N/A |
| Zig 0.8+ | 2, 8, 10, 16 | "" | 0b | 0o | 0x | Decimal | _ | No | No | No | N/A |
| Zig ≤0.7 | 2, 8, 10, 16 | "" | 0b | 0o | 0x | Decimal | N/A | N/A | N/A | N/A | N/A |

Shared definitions:

dec_digit       ::= [0-9]
bin_digit       ::= [0-1]
oct_digit       ::= [0-7]
hex_digit       ::= [0-9 a-f A-F]

C-style

C

C23

integer_literal ::= (dec_literal | bin_literal | oct_literal | hex_literal) integer_suffix?
dec_literal     ::= [1-9] ("'"? dec_digit)*
bin_literal     ::= "0" [bB] bin_digit ("'"? bin_digit)*
oct_literal     ::= "0" ("'"? oct_digit)*
hex_literal     ::= "0" [xX] hex_digit ("'"? hex_digit)*
integer_suffix  ::= unsigned_suffix (long_suffix | long_long_suffix | bit_precise_int_suffix)?
                  | (long_suffix | long_long_suffix | bit_precise_int_suffix) unsigned_suffix?
unsigned_suffix ::= [uU]
long_suffix     ::= [lL]
long_long_suffix ::= "ll" | "LL"
bit_precise_int_suffix ::= "wb" | "WB"

From §6.4.4.1 Integer constants in the C Standard as of N3096 (2023-04-02).
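
As a rough check of the C23 productions above, they can be transliterated into a Python regular expression. This is only a sketch: the names C23_INT and is_c23_integer are made up for illustration, and the pattern covers the shape of the token only, not the type or range rules of the standard.

```python
import re

# Transliteration of the C23 productions above (illustrative names).
DEC = r"[1-9](?:'?[0-9])*"
BIN = r"0[bB][01](?:'?[01])*"
OCT = r"0(?:'?[0-7])*"
HEX = r"0[xX][0-9a-fA-F](?:'?[0-9a-fA-F])*"
# long_suffix, long_long_suffix, and bit_precise_int_suffix.
LONGISH = r"(?:ll|LL|[lL]|wb|WB)"
SUFFIX = rf"(?:[uU]{LONGISH}?|{LONGISH}[uU]?)"
C23_INT = re.compile(rf"(?:{HEX}|{BIN}|{DEC}|{OCT})(?:{SUFFIX})?")


def is_c23_integer(text: str) -> bool:
    """Return True if text has the shape of a C23 integer constant."""
    return C23_INT.fullmatch(text) is not None


# A single quote may separate digits, but not repeat, lead, or trail.
assert is_c23_integer("1'000'000ull")
assert is_c23_integer("0b1010'1010")
assert not is_c23_integer("1''000")
assert not is_c23_integer("08")  # a leading zero makes the literal octal
```

Because fullmatch anchors the whole string, the order of the alternatives does not affect the result.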

C99, C11, and C17

integer_literal ::= (dec_literal | oct_literal | hex_literal) integer_suffix?
dec_literal     ::= [1-9] dec_digit*
oct_literal     ::= "0" oct_digit*
hex_literal     ::= "0" [xX] hex_digit+
integer_suffix  ::= unsigned_suffix (long_suffix | long_long_suffix)?
                  | (long_suffix | long_long_suffix) unsigned_suffix?
unsigned_suffix ::= [uU]
long_suffix     ::= [lL]
long_long_suffix ::= "ll" | "LL"

From §6.4.4.1 Integer constants in the C Standard as of N1256 (2007-09-07), N1570 (2011-04-04), and N2310 (2018-11-11).

C89 and C90

C89 and C90 have the same integer literals as later versions through C17, except that they lack a suffix for long long:

integer_suffix  ::= unsigned_suffix long_suffix? | long_suffix unsigned_suffix?
unsigned_suffix ::= [uU]
long_suffix     ::= [lL]

From §3.1.3.2 Integer constants in the C89 draft and §6.1.3.2 Integer constants in the C90 standard.

C#

integer_literal ::= (dec_literal | bin_literal | hex_literal) integer_suffix?
dec_literal     ::= dec_digit ("_"* dec_digit)*
bin_literal     ::= ("0" [bB]) ("_"* bin_digit)+
hex_literal     ::= ("0" [xX]) ("_"* hex_digit)+
integer_suffix  ::= [Uu][Ll]? | [Ll][Uu]?

According to the language specification as of C# 7 and described informally in the language reference.

Go

Go 1.13+

integer_literal ::= dec_literal | bin_literal | oct_literal | hex_literal
dec_literal     ::= [1-9] ("_"? dec_digit)* | "0"
bin_literal     ::= "0" [bB] ("_"? bin_digit)+
oct_literal     ::= "0" [oO]? ("_"? oct_digit)+
hex_literal     ::= "0" [xX] ("_"? hex_digit)+

From the language specification as of Go 1.13 through 1.21.6.

Go ≤1.12

integer_literal ::= dec_literal | oct_literal | hex_literal
dec_literal     ::= [1-9] dec_digit*
oct_literal     ::= "0" oct_digit*
hex_literal     ::= "0" [xX] hex_digit+

From the language specification as of the initial commit on 2008-03-02 through Go 1.12.

Haskell

Haskell 1.3+

integer_literal ::= dec_literal | oct_literal | hex_literal
dec_literal     ::= dec_digit+
oct_literal     ::= "0" [oO] oct_digit+
hex_literal     ::= "0" [xX] hex_digit+

From the Haskell Report, as of Haskell 1.3, 1.4, 98, and 2010.

Haskell 1.0–1.2

integer_literal ::= dec_digit+

From the Haskell Report, as of Haskell 1.0, 1.1, and 1.2.

NumericUnderscores extension

integer_literal ::= dec_literal | oct_literal | hex_literal
dec_literal     ::= dec_digit ("_"* dec_digit)*
oct_literal     ::= "0" [oO] ("_"* oct_digit)+
hex_literal     ::= "0" [xX] ("_"* hex_digit)+

From the NumericUnderscores language extension.

BinaryLiterals extension

bin_literal     ::= "0" [bB] bin_digit+

Underscores are allowed when NumericUnderscores is also enabled:

bin_literal     ::= "0" [bB] ("_"* bin_digit)+

From the BinaryLiterals language extension.

Java

Java 7+

integer_literal ::= (dec_literal | bin_literal | oct_literal | hex_literal) integer_suffix?
dec_literal     ::= [1-9] ("_"* dec_digit)* | "0"
bin_literal     ::= "0" [bB] bin_digit ("_"* bin_digit)*
oct_literal     ::= "0" ("_"* oct_digit)+
hex_literal     ::= "0" [xX] hex_digit ("_"* hex_digit)*
integer_suffix  ::= [lL]

From The Java Language Specification as of Java SE 7 through 21.

Java ≤6

integer_literal ::= (dec_literal | oct_literal | hex_literal) integer_suffix?
dec_literal     ::= [1-9] dec_digit* | "0"
oct_literal     ::= "0" oct_digit+
hex_literal     ::= "0" [xX] hex_digit+
integer_suffix  ::= [lL]

From The Java Language Specification as of the First Edition, Second Edition, Third Edition, and Java SE 6.

JSON

number_literal  ::= "-"? integer fraction? exponent?
integer         ::= [1-9] dec_digit* | "0"
fraction        ::= "." dec_digit+
exponent        ::= [eE] [-+]? dec_digit+

Specified in RFC 4627, RFC 7159, RFC 8259, ECMA-404 (1st and 2nd editions), and ISO/IEC 21778:2017.
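
The JSON grammar above can be compared against a real parser. The sketch below transliterates it into a Python regular expression and checks a few samples against the standard-library json module. The names JSON_NUMBER, is_json_number, and json_accepts are illustrative, and the comparison only holds for bare numbers without surrounding whitespace.

```python
import json
import re

# Transliteration of the JSON number grammar above (illustrative name).
JSON_NUMBER = re.compile(r"-?(?:[1-9][0-9]*|0)(?:\.[0-9]+)?(?:[eE][-+]?[0-9]+)?")


def is_json_number(text: str) -> bool:
    """Return True if text matches the number_literal production."""
    return JSON_NUMBER.fullmatch(text) is not None


def json_accepts(text: str) -> bool:
    """Ask the standard-library parser whether text is a lone JSON number."""
    try:
        value = json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Leading zeros, a leading plus, and other bases are all rejected.
for sample in ["0", "-0", "10", "1e3", "1.5E-2", "01", "+1", "1.", ".5", "0x10"]:
    assert is_json_number(sample) == json_accepts(sample), sample
```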

Python

Python 3.6+

integer_literal ::= dec_literal | bin_literal | oct_literal | hex_literal
dec_literal     ::= [1-9] ("_"? dec_digit)* | "0" ("_"? "0")*
bin_literal     ::= "0" [bB] ("_"? bin_digit)+
oct_literal     ::= "0" [oO] ("_"? oct_digit)+
hex_literal     ::= "0" [xX] ("_"? hex_digit)+

From the language reference as of Python 3.6 through 3.12.1.
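
Python itself can serve as a reference for this grammar, since int(text, 0) interprets its argument as an integer literal (it additionally accepts a sign and surrounding whitespace, which the samples below avoid). The following sketch, with the illustrative names PY_INT, matches_grammar, and int_accepts, assumes Python 3.6 or later.

```python
import re

# Transliteration of the Python 3.6+ productions above (illustrative name).
PY_INT = re.compile(
    r"[1-9](?:_?[0-9])*|0(?:_?0)*"
    r"|0[bB](?:_?[01])+"
    r"|0[oO](?:_?[0-7])+"
    r"|0[xX](?:_?[0-9a-fA-F])+"
)


def matches_grammar(text: str) -> bool:
    """Return True if text matches the integer_literal production."""
    return PY_INT.fullmatch(text) is not None


def int_accepts(text: str) -> bool:
    """Return True if int() with base 0 parses text as a literal."""
    try:
        int(text, 0)
    except ValueError:
        return False
    return True


# One underscore may follow the prefix or sit between digits;
# nonzero decimals with a leading zero are illegal.
samples = ["1_000", "0x_ff", "0b_1", "0_0", "00", "012", "1__0", "_1", "1_", "0x", "0o17"]
for sample in samples:
    assert matches_grammar(sample) == int_accepts(sample), sample
```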

Python 3.0–3.5

integer_literal ::= dec_literal | bin_literal | oct_literal | hex_literal
dec_literal     ::= [1-9] dec_digit* | "0"+
bin_literal     ::= "0" [bB] bin_digit+
oct_literal     ::= "0" [oO] oct_digit+
hex_literal     ::= "0" [xX] hex_digit+

C-style octal numbers with a leading zero were allowed until Python 3.0.

From the language reference as of Python 3.0 through 3.5.

Python 2.6–2.7

integer_literal ::= (dec_literal | bin_literal | oct_literal | hex_literal) long_suffix?
dec_literal     ::= [1-9] dec_digit* | "0"
bin_literal     ::= "0" [bB] bin_digit+
oct_literal     ::= "0" [oO] oct_digit+ | "0" oct_digit+
hex_literal     ::= "0" [xX] hex_digit+
long_suffix     ::= [lL]

From the language reference as of Python 2.6 through 2.7.

Python ≤2.5

integer_literal ::= (dec_literal | oct_literal | hex_literal) long_suffix?
dec_literal     ::= [1-9] dec_digit* | "0"
oct_literal     ::= "0" oct_digit+
hex_literal     ::= "0" [xX] hex_digit+
long_suffix     ::= [lL]

From the language reference as of Python 1.4 through 2.5.

The language reference was first released with Python 1.4. Before that, as early as Python 0.9.1, the tokenizer already matched this grammar.

Ruby

integer_literal ::= dec_literal | bin_literal | oct_literal | hex_literal
dec_literal     ::= [1-9] ("_"? dec_digit)* | "0" [dD] dec_digit ("_"? dec_digit)*
bin_literal     ::= "0" [bB] bin_digit ("_"? bin_digit)*
oct_literal     ::= "0" [oO] oct_digit ("_"? oct_digit)* | "0" ("_"? oct_digit)*
hex_literal     ::= "0" [xX] hex_digit ("_"? hex_digit)*

From the informal language documentation supplemented with the Ruby Spec Suite, revised 13 Nov 2023.

Rust

integer_literal ::= (dec_literal | bin_literal | oct_literal | hex_literal) integer_suffix?
dec_literal     ::= dec_digit (dec_digit | "_")*
bin_literal     ::= "0b" "_"* bin_digit (bin_digit | "_")*
oct_literal     ::= "0o" "_"* oct_digit (oct_digit | "_")*
hex_literal     ::= "0x" "_"* hex_digit (hex_digit | "_")*

The pattern integer_suffix has varied as integer types have been added.

From the language reference as of Rust 0.9 through 1.75.
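
Rust is unusual in allowing leading (after the prefix), trailing, and repeated underscores. A minimal sketch of the four productions above as a Python regular expression makes this concrete. It leaves out integer_suffix, and the names RUST_INT and is_rust_integer are illustrative.

```python
import re

# Transliteration of the Rust productions above, without integer_suffix.
RUST_INT = re.compile(
    r"[0-9][0-9_]*"
    r"|0b_*[01][01_]*"
    r"|0o_*[0-7][0-7_]*"
    r"|0x_*[0-9a-fA-F][0-9a-fA-F_]*"
)


def is_rust_integer(text: str) -> bool:
    """Return True if text matches the unsuffixed integer_literal shape."""
    return RUST_INT.fullmatch(text) is not None


assert is_rust_integer("1__0_")    # repeated and trailing separators
assert is_rust_integer("0x__ff_")  # separators directly after the prefix
assert not is_rust_integer("0b_")  # at least one digit is required
assert not is_rust_integer("_1")   # a decimal literal cannot start with "_"
assert not is_rust_integer("0X1")  # prefixes are lowercase only
```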

Earlier Rust versions

#### Rust 0.1–0.8

These versions had no octal literals, but were otherwise identical to later versions.

```bnf
integer_literal ::= (dec_literal | bin_literal | hex_literal) integer_suffix?
dec_literal     ::= dec_digit (dec_digit | "_")*
bin_literal     ::= "0b" "_"* bin_digit (bin_digit | "_")*
hex_literal     ::= "0x" "_"* hex_digit (hex_digit | "_")*
integer_suffix  ::= "u8" | "u16" | "u32" | "u64" | "u" | "i8" | "i16" | "i32" | "i64" | "i"
```

From the language reference as of [Rust 0.3](https://doc.rust-lang.org/0.3/rust.html#number-literals) through [0.8](https://doc.rust-lang.org/0.8/rust.html#number-literals). The rustc and rustboot lexers matched this behavior back to [2010-07-27](https://github.com/rust-lang/rust/blob/80307576245aabf00285db020bbfbc4c3a891766/src/boot/fe/lexer.mll#L141-L145) in rust-lang/rust, 1.5 years before Rust 0.1.

From [2010-07-01](https://github.com/rust-lang/rust/blob/afc0dc8bfcc5d6fba1e907ab35c110fc074cad67/src/boot/fe/lexer.mll#L123-L127) through [2010-07-27](https://github.com/rust-lang/rust/blob/6662aeb779d3e44886c466378578ebe1979de15a/src/boot/fe/lexer.mll#L124-L128) in rust-lang/rust, integer suffixes had not yet been added.

#### Rust rustboot 2007–2010

These revisions had no suffixes or octal literals, and did not allow leading underscores due to a bug.

```bnf
integer_literal ::= dec_literal | bin_literal | hex_literal
dec_literal     ::= dec_digit+
bin_literal     ::= "0b" bin_digit (bin_digit | "_")*
hex_literal     ::= "0x" hex_digit (hex_digit | "_")*
```

From [2007-05-22](https://github.com/graydon/rust-prehistory/blob/aa2d738554e561d526809f3cba0fd643e3d12906/src/lexer.mll#L77-L79) in graydon/rust-prehistory through [2010-07-01](https://github.com/rust-lang/rust/blob/3aaff59dba4b9fff598c49eeb579cb6c631dd4f4/src/boot/fe/lexer.mll#L123-L126) in rust-lang/rust.

#### Rust rustboot 2006–2007

These revisions had no suffixes, and did not allow leading underscores due to a bug.

```bnf
integer_literal ::= dec_literal | bin_literal | oct_literal | hex_literal
dec_literal     ::= dec_digit+
bin_literal     ::= "0b" bin_digit (bin_digit | "_")*
oct_literal     ::= "0o" oct_digit (oct_digit | "_")*
hex_literal     ::= "0x" hex_digit (hex_digit | "_")*
```

From the initial commit on [2006-07-23](https://github.com/graydon/rust-prehistory/blob/b0fd440798ab3cfb05c60a1a1bd2894e1618479e/src/lexer.mll#L66-L69) through [2007-05-22](https://github.com/graydon/rust-prehistory/blob/c1f80de7286b4268a4c1ebaddfa35cc2d7c57a4d/src/lexer.mll#L77-L80) in graydon/rust-prehistory.

YAML

YAML 1.2

integer_literal ::= dec_literal | oct_literal | hex_literal
dec_literal     ::= [-+]? dec_digit+
oct_literal     ::= "0o" oct_digit+
hex_literal     ::= "0x" hex_digit+

According to the patterns in Tag Resolution in the language specification as of versions 1.2.0, 1.2.1, and 1.2.2. The separate section on the int type was removed and this section is less specific, so some details may be missing.

YAML 1.1

Until YAML 1.2, YAML had sexagesimal (base 60) literals.

integer_literal ::= [-+]? (dec_literal | bin_literal | oct_literal | hex_literal | sex_literal)
dec_literal     ::= [1-9] (dec_digit | "_")* | "0"
bin_literal     ::= "0b" (bin_digit | "_")+
oct_literal     ::= "0" (oct_digit | "_")+
hex_literal     ::= "0x" (hex_digit | "_")+
sex_literal     ::= [1-9][0-9_]* (":" [0-5]?[0-9])+

The grammar allows literals with no digits at all after the prefix, only underscores (for example, 0x_). This may be a bug in the spec. An example has the decimal literal +12,345, but I assume this was a mistake.

Specified in the Integer Language-Independent Type specification for YAML 1.1. The YAML specification 1.0 does not have enough details to describe it, but it has the same examples for decimal, octal, hexadecimal, and sexagesimal as 1.1.
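
To make the sexagesimal form concrete, here is a minimal sketch of how such a literal maps to a value: each colon-separated group is one base-60 digit. The function name sexagesimal_to_int is made up for illustration, and the function assumes its input already matches sex_literal with an optional sign, so it performs no validation.

```python
def sexagesimal_to_int(text: str) -> int:
    """Convert a YAML 1.1 base-60 integer such as '190:20:30' to its value."""
    sign = 1
    if text[0] in "+-":
        sign = -1 if text[0] == "-" else 1
        text = text[1:]
    value = 0
    # Underscores are separators only; each group is one base-60 digit.
    for group in text.replace("_", "").split(":"):
        value = value * 60 + int(group)
    return sign * value


# 190 * 3600 + 20 * 60 + 30
assert sexagesimal_to_int("190:20:30") == 685230
```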

Zig

Zig 0.8+

integer_literal ::= dec_literal | bin_literal | oct_literal | hex_literal
dec_literal     ::= dec_digit ("_"? dec_digit)*
bin_literal     ::= "0b" bin_digit ("_"? bin_digit)*
oct_literal     ::= "0o" oct_digit ("_"? oct_digit)*
hex_literal     ::= "0x" hex_digit ("_"? hex_digit)*

Defined in the Zig grammar from Zig 0.8.0 through 0.11.0.

Zig ≤0.7

integer_literal ::= dec_literal | bin_literal | oct_literal | hex_literal
dec_literal     ::= dec_digit+
bin_literal     ::= "0b" bin_digit+
oct_literal     ::= "0o" oct_digit+
hex_literal     ::= "0x" hex_digit+

Defined in the Zig grammar from Zig 0.4.0 through 0.7.1 and documented by example from 0.1.0 through 0.3.0.

Ada-style

Ada

Ada supports integer literals of any base from 2 to 16, with the form base#value#exponent.

integer_literal ::= dec_literal | base_2_literal | … | base_16_literal
dec_literal     ::= dec_digit ("_"? dec_digit)* exponent?
base_2_literal  ::= "0"* "2#" [0-1] ("_"? [0-1])* "#" exponent?
base_3_literal  ::= "0"* "3#" [0-2] ("_"? [0-2])* "#" exponent?
…
base_10_literal ::= "0"* "10#" [0-9] ("_"? [0-9])* "#" exponent?
base_11_literal ::= "0"* "11#" [0-9 a A] ("_"? [0-9 a A])* "#" exponent?
base_12_literal ::= "0"* "12#" [0-9 a-b A-B] ("_"? [0-9 a-b A-B])* "#" exponent?
…
base_16_literal ::= "0"* "16#" [0-9 a-f A-F] ("_"? [0-9 a-f A-F])* "#" exponent?
exponent        ::= [eE] [+-]? dec_literal

From §2.4.1 Decimal Literals and §2.4.2 Based Literals in the Language Reference Manual as of Ada 83, 95, 2005, 2012, and 2022 draft 35.
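
The exponent of a based literal scales by the literal's own base, not by 10, so 16#E#E1 is 14 × 16 = 224. The sketch below evaluates based literals under that reading. The names BASED and ada_based_value are illustrative. It accepts only non-negative exponents, which is a simplification relative to the exponent production above, and it does not handle plain decimal literals.

```python
import re

# Shape of a based literal: base, "#", digits, "#", optional non-negative exponent.
BASED = re.compile(
    r"([0-9]+)#([0-9a-fA-F](?:_?[0-9a-fA-F])*)#(?:[eE]\+?([0-9](?:_?[0-9])*))?"
)


def ada_based_value(text: str) -> int:
    """Evaluate a based integer literal such as '16#E#E1'."""
    match = BASED.fullmatch(text)
    if match is None:
        raise ValueError(f"not a based integer literal: {text!r}")
    base = int(match.group(1))  # leading zeros in the base are harmless
    if not 2 <= base <= 16:
        raise ValueError(f"base out of range: {base}")
    value = 0
    for ch in match.group(2).replace("_", ""):
        digit = int(ch, 16)
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit
    exponent = int(match.group(3) or "0")
    return value * base ** exponent


assert ada_based_value("2#1111_1111#") == 255
assert ada_based_value("16#E#E1") == 224
```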

Erlang

Erlang supports integer literals of any base from 2 to 36, with the form base#value.

integer_literal ::= dec_literal | base_2_literal | … | base_36_literal
dec_literal     ::= dec_digit ("_"? dec_digit)*
base_2_literal  ::= "0"* "2#" [0-1] ("_"? [0-1])*
base_3_literal  ::= "0"* "3#" [0-2] ("_"? [0-2])*
…
base_10_literal ::= "0"* "10#" [0-9] ("_"? [0-9])*
base_11_literal ::= "0"* "11#" [0-9 a A] ("_"? [0-9 a A])*
base_12_literal ::= "0"* "12#" [0-9 a-b A-B] ("_"? [0-9 a-b A-B])*
…
base_36_literal ::= "0"* "36#" [0-9 a-z A-Z] ("_"? [0-9 a-z A-Z])*

From the reference manual as of Erlang 14.2.1.

Visual Basic

Visual Basic 15.5

VB 15.5 added leading digit separators.

integer_literal ::= (dec_literal | bin_literal | oct_literal | hex_literal) integer_suffix?
dec_literal     ::= dec_digit ("_"* dec_digit)*
bin_literal     ::= ("&" [Bb]) ("_"* bin_digit)+
oct_literal     ::= ("&" [Oo]) ("_"* oct_digit)+
hex_literal     ::= ("&" [Hh]) ("_"* hex_digit)+
integer_suffix  ::= [Uu]?[Ss] | [Uu]?[Ii] | [Uu]?[Ll] | "%" | "&"

Noted in what’s new for Visual Basic 15.5.

Visual Basic 15.0

VB 15.0 added binary literals and digit separators.

integer_literal ::= (dec_literal | bin_literal | oct_literal | hex_literal) integer_suffix?
dec_literal     ::= dec_digit ("_"* dec_digit)*
bin_literal     ::= ("&" [Bb]) bin_digit ("_"* bin_digit)*
oct_literal     ::= ("&" [Oo]) oct_digit ("_"* oct_digit)*
hex_literal     ::= ("&" [Hh]) hex_digit ("_"* hex_digit)*
integer_suffix  ::= [Uu]?[Ss] | [Uu]?[Ii] | [Uu]?[Ll] | "%" | "&"

Noted in what’s new for Visual Basic 15.0.

Visual Basic 8.0–11.0

VB 8.0 added unsigned types.

integer_literal ::= (dec_literal | oct_literal | hex_literal) integer_suffix?
dec_literal     ::= dec_digit+
oct_literal     ::= ("&" [Oo]) oct_digit+
hex_literal     ::= ("&" [Hh]) hex_digit+
integer_suffix  ::= [Uu]?[Ss] | [Uu]?[Ii] | [Uu]?[Ll] | "%" | "&"

Decimal literals represent the signed decimal value of the integral literal, whereas octal and hexadecimal literals represent the unsigned binary value.

The suffix variants stand for the types Short and UShort, Integer and UInteger, Long and ULong, Integer, and Long, respectively. When no type suffix is specified, the type is Integer if it is in the range of Integer, Long if in the range of Long, or otherwise a compile-time error.
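
The two rules above interact: an unsuffixed decimal literal is typed by its signed value, while an octal or hexadecimal literal is typed by its bit pattern and then reinterpreted as signed. The following sketch shows one reading of those rules, assuming Integer is 32 bits and Long is 64 bits. The function name vb_default_type and its parameters are hypothetical.

```python
def vb_default_type(magnitude: int, decimal: bool) -> tuple[str, int]:
    """Return the (type, value) of an unsuffixed literal under the rules above.

    magnitude is the non-negative number the digits spell out; decimal says
    whether the literal is decimal (True) or octal/hexadecimal (False).
    """
    if decimal:
        # Decimal literals are signed: they must fit the positive range.
        if magnitude <= 2**31 - 1:
            return ("Integer", magnitude)
        if magnitude <= 2**63 - 1:
            return ("Long", magnitude)
    else:
        # Octal and hex literals are bit patterns, reinterpreted as signed.
        if magnitude < 2**32:
            return ("Integer", magnitude - 2**32 if magnitude >= 2**31 else magnitude)
        if magnitude < 2**64:
            return ("Long", magnitude - 2**64 if magnitude >= 2**63 else magnitude)
    raise OverflowError("literal does not fit in Long")


assert vb_default_type(2147483648, decimal=True) == ("Long", 2147483648)
assert vb_default_type(0xFFFFFFFF, decimal=False) == ("Integer", -1)
```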

According to the language specification for Visual Basic 8.0–11.0.

Visual Basic 7.0–7.1

integer_suffix  ::= [Ss] | [Ii] | [Ll] | "%" | "&"

According to the language specification for Visual Basic 7.0–7.1.