A concrete subclass of
NumberFormat that formats decimal numbers. It
has a variety of features designed to make it possible to parse and format
numbers in any locale, including support for Western, Arabic, or Indic
digits. It also supports different flavors of numbers, including integers
("123"), fixed-point numbers ("123.4"), scientific notation ("1.23E4"),
percentages ("12%"), and currency amounts ("$123"). All of these flavors can
be easily localized.
This is an enhanced version of
DecimalFormat that is based on
the standard version in the RI. New or changed functionality is labeled
NEW.
To obtain a
NumberFormat for a specific locale (including the default
locale), call one of
NumberFormat's factory methods such as
NumberFormat.getInstance. Do not call the
DecimalFormatconstructors directly, unless you know what you are doing, since the
NumberFormat factory methods may return subclasses other than
DecimalFormat. If you need to customize the format object, do
something like this:
NumberFormat f = NumberFormat.getInstance(loc);
if (f instanceof DecimalFormat) {
((DecimalFormat)f).setDecimalSeparatorAlwaysShown(true);
}
Patterns
A
DecimalFormat consists of a pattern and a set of
symbols. The pattern may be set directly using
#applyPattern(String), or indirectly using other API methods which
manipulate aspects of the pattern, such as the minimum number of integer
digits. The symbols are stored in a
DecimalFormatSymbols object. When
using the
NumberFormat factory methods, the pattern and symbols are
read from ICU's locale data.
Special Pattern Characters
Many characters in a pattern are taken literally; they are matched during
parsing and are written out unchanged during formatting. On the other hand,
special characters stand for other characters, strings, or classes of
characters. For example, the '#' character is replaced by a localized digit.
Often the replacement character is the same as the pattern character; in the
U.S. locale, the ',' grouping character is replaced by ','. However, the
replacement is still happening, and if the symbols are modified, the grouping
character changes. Some special characters affect the behavior of the
formatter by their presence; for example, if the percent character is seen,
then the value is multiplied by 100 before being displayed.
To insert a special character in a pattern as a literal, that is, without any
special meaning, the character must be quoted. There are some exceptions to
this which are noted below.
The characters listed here are used in non-localized patterns. Localized
patterns use the corresponding characters taken from this formatter's
DecimalFormatSymbols object instead, and these characters lose their
special status. Two exceptions are the currency sign and quote, which are not
localized.
Symbol |
Location |
Localized? |
Meaning |
0 |
Number |
Yes |
Digit. |
@ |
Number |
No |
NEW Significant
digit. |
# |
Number |
Yes |
Digit, leading zeroes are not shown. |
. |
Number |
Yes |
Decimal separator or monetary decimal separator. |
- |
Number |
Yes |
Minus sign. |
, |
Number |
Yes |
Grouping separator. |
E |
Number |
Yes |
Separates mantissa and exponent in scientific notation.
Does not need to be quoted in prefix or suffix. |
+ |
Exponent |
Yes |
NEW Prefix
positive exponents with localized plus sign.
Does not need to be quoted in prefix or suffix. |
; |
Subpattern boundary |
Yes |
Separates positive and negative subpatterns. |
% |
Prefix or suffix |
Yes |
Multiply by 100 and show as percentage. |
\u2030 (
\u005Cu2030) |
Prefix or suffix |
Yes |
Multiply by 1000 and show as per mille. |
\u00A4 (
\u005Cu00A4) |
Prefix or suffix |
No |
Currency sign, replaced by currency symbol. If doubled, replaced by
international currency symbol. If present in a pattern, the monetary decimal
separator is used instead of the decimal separator. |
' |
Prefix or suffix |
No |
Used to quote special characters in a prefix or suffix, for example,
"'#'#" formats 123 to
"#123". To create a single quote
itself, use two in a row:
"# o''clock". |
* |
Prefix or suffix boundary |
Yes |
NEW Pad escape,
precedes pad character. |
A
DecimalFormat pattern contains a positive and negative subpattern,
for example, "#,##0.00;(#,##0.00)". Each subpattern has a prefix, a numeric
part and a suffix. If there is no explicit negative subpattern, the negative
subpattern is the localized minus sign prefixed to the positive subpattern.
That is, "0.00" alone is equivalent to "0.00;-0.00". If there is an explicit
negative subpattern, it serves only to specify the negative prefix and
suffix; the number of digits, minimal digits, and other characteristics are
ignored in the negative subpattern. This means that "#,##0.0#;(#)" produces
precisely the same result as "#,##0.0#;(#,##0.0#)".
The prefixes, suffixes, and various symbols used for infinity, digits,
thousands separators, decimal separators, etc. may be set to arbitrary
values, and they will appear properly during formatting. However, care must
be taken that the symbols and strings do not conflict, or parsing will be
unreliable. For example, either the positive and negative prefixes or the
suffixes must be distinct for
#parse to be able to distinguish
positive from negative values. Another example is that the decimal separator
and thousands separator should be distinct characters, or parsing will be
impossible.
The grouping separator is a character that separates clusters of
integer digits to make large numbers more legible. It is commonly used for
thousands, but in some locales it separates ten-thousands. The grouping
size
is the number of digits between the grouping separators, such as 3 for
"100,000,000" or 4 for "1 0000 0000". There are actually two different
grouping sizes: One used for the least significant integer digits, the
primary grouping size, and one used for all others, the
secondary grouping size. In most locales these are the same, but
sometimes they are different. For example, if the primary grouping interval
is 3, and the secondary is 2, then this corresponds to the pattern
"#,##,##0", and the number 123456789 is formatted as "12,34,56,789". If a
pattern contains multiple grouping separators, the interval between the last
one and the end of the integer defines the primary grouping size, and the
interval between the last two defines the secondary grouping size. All others
are ignored, so "#,##,###,####", "###,###,####" and "##,#,###,####" produce
the same result.
Illegal patterns, such as "#.#.#" or "#.###,###", will cause
DecimalFormat to throw an
IllegalArgumentException with a
message that describes the problem.
Pattern BNF
pattern := subpattern (';' subpattern)?
subpattern := prefix? number exponent? suffix?
number := (integer ('.' fraction)?) | sigDigits
prefix := '\\u0000'..'\\uFFFD' - specialCharacters
suffix := '\\u0000'..'\\uFFFD' - specialCharacters
integer := '#'* '0'* '0'
fraction := '0'* '#'
sigDigits := '#'* '@' '@'* '#'
exponent := 'E' '+'? '0'* '0'
padSpec := '*' padChar
padChar := '\\u0000'..'\\uFFFD' - quote
Notation:
X* 0 or more instances of X
X? 0 or 1 instances of X
X|Y either X or Y
C..D any character from C up to D, inclusive
S-T characters in S, except those in T
The first subpattern is for positive numbers. The second (optional)
subpattern is for negative numbers.
Not indicated in the BNF syntax above:
- The grouping separator ',' can occur inside the integer and sigDigits
elements, between any two pattern characters of that element, as long as the
integer or sigDigits element is not followed by the exponent element.
- NEW Two
grouping intervals are recognized: The one between the decimal point and the
first grouping symbol and the one between the first and second grouping
symbols. These intervals are identical in most locales, but in some locales
they differ. For example, the pattern "#,##,###" formats the number
123456789 as "12,34,56,789".
- NEW The pad
specifier
padSpec may appear before the prefix, after the prefix,
before the suffix, after the suffix or not at all.
Parsing
DecimalFormat parses all Unicode characters that represent decimal
digits, as defined by
Character#digit(int,int). In addition,
DecimalFormat also recognizes as digits the ten consecutive
characters starting with the localized zero digit defined in the
DecimalFormatSymbols object. During formatting, the
DecimalFormatSymbols-based digits are written out.
During parsing, grouping separators are ignored.
If
#parse(String,ParsePosition) fails to parse a string, it returns
null and leaves the parse position unchanged.
Formatting
Formatting is guided by several parameters, all of which can be specified
either using a pattern or using the API. The following description applies to
formats that do not use scientific notation or significant digits.
- If the number of actual integer digits exceeds the
maximum integer digits, then only the least significant digits
are shown. For example, 1997 is formatted as "97" if maximum integer digits
is set to 2.
- If the number of actual integer digits is less than the
minimum integer digits, then leading zeros are added. For
example, 1997 is formatted as "01997" if minimum integer digits is set to 5.
- If the number of actual fraction digits exceeds the maximum
fraction digits,
then half-even rounding is performed to the maximum fraction digits. For
example, 0.125 is formatted as "0.12" if the maximum fraction digits is 2.
- If the number of actual fraction digits is less than the
minimum fraction digits, then trailing zeros are added. For
example, 0.125 is formatted as "0.1250" if the minimum fraction digits is set
to 4.
- Trailing fractional zeros are not displayed if they occur j
positions after the decimal, where j is less than the maximum
fraction digits. For example, 0.10004 is formatted as "0.1" if the maximum
fraction digits is four or less.
Special Values
NaN is represented as a single character, typically
\u005cuFFFD. This character is determined by the
DecimalFormatSymbols object. This is the only value for which the
prefixes and suffixes are not used.
Infinity is represented as a single character, typically
\u005cu221E,
with the positive or negative prefixes and suffixes applied. The infinity
character is determined by the
DecimalFormatSymbols object.
Scientific Notation
Numbers in scientific notation are expressed as the product of a mantissa and
a power of ten, for example, 1234 can be expressed as 1.234 x 103.
The mantissa is typically in the half-open interval [1.0, 10.0) or sometimes
[0.0, 1.0), but it does not need to be.
DecimalFormat supports
arbitrary mantissas.
DecimalFormat can be instructed to use
scientific notation through the API or through the pattern. In a pattern, the
exponent character immediately followed by one or more digit characters
indicates scientific notation. Example: "0.###E0" formats the number 1234 as
"1.234E3".
- The number of digit characters after the exponent character gives the
minimum exponent digit count. There is no maximum. Negative exponents are
formatted using the localized minus sign, not the prefix and
suffix from the pattern. This allows patterns such as "0.###E0 m/s". To
prefix positive exponents with a localized plus sign, specify '+' between the
exponent and the digits: "0.###E+0" will produce formats "1E+1", "1E+0",
"1E-1", etc. (In localized patterns, use the localized plus sign rather than
'+'.)
- The minimum number of integer digits is achieved by adjusting the
exponent. Example: 0.00123 formatted with "00.###E0" yields "12.3E-4". This
only happens if there is no maximum number of integer digits. If there is a
maximum, then the minimum number of integer digits is fixed at one.
- The maximum number of integer digits, if present, specifies the exponent
grouping. The most common use of this is to generate engineering
notation,
in which the exponent is a multiple of three, e.g., "##0.###E0". The number
12345 is formatted using "##0.###E0" as "12.345E3".
- When using scientific notation, the formatter controls the digit counts
using significant digits logic. The maximum number of significant digits
limits the total number of integer and fraction digits that will be shown in
the mantissa; it does not affect parsing. For example, 12345 formatted with
"##0.##E0" is "12.3E3". See the section on significant digits for more
details.
- The number of significant digits shown is determined as follows: If no
significant digits are used in the pattern then the minimum number of
significant digits shown is one, the maximum number of significant digits
shown is the sum of the minimum integer and
maximum fraction digits, and it is unaffected by the maximum
integer digits. If this sum is zero, then all significant digits are shown.
If significant digits are used in the pattern then the number of integer
digits is fixed at one and there is no exponent grouping.
- Exponential patterns may not contain grouping separators.
NEW Significant
Digits
DecimalFormat has two ways of controlling how many digits are
shown: (a) significant digit counts or (b) integer and fraction digit counts.
Integer and fraction digit counts are described above. When a formatter uses
significant digits counts, the number of integer and fraction digits is not
specified directly, and the formatter settings for these counts are ignored.
Instead, the formatter uses as many integer and fraction digits as required
to display the specified number of significant digits.
Examples:
Pattern |
Minimum significant digits |
Maximum significant digits |
Number |
Output of format() |
@@@ | 3 |
3 |
12345 |
12300 |
@@@ |
3 |
3 |
0.12345 |
0.123 |
@@## |
2 |
4 |
3.14159 |
3.142 |
@@## |
2 |
4 |
1.23004 |
1.23 |
- Significant digit counts may be expressed using patterns that specify a
minimum and maximum number of significant digits. These are indicated by the
'@' and
'#' characters. The minimum number of significant
digits is the number of
'@' characters. The maximum number of
significant digits is the number of
'@' characters plus the number of
'#' characters following on the right. For example, the pattern
"@@@" indicates exactly 3 significant digits. The pattern
"@##" indicates from 1 to 3 significant digits. Trailing zero digits
to the right of the decimal separator are suppressed after the minimum number
of significant digits have been shown. For example, the pattern
"@##"formats the number 0.1203 as
"0.12".
- If a pattern uses significant digits, it may not contain a decimal
separator, nor the
'0' pattern character. Patterns such as
"@00" or
"@.###" are disallowed.
- Any number of
'#' characters may be prepended to the left of the
leftmost
'@' character. These have no effect on the minimum and
maximum significant digit counts, but may be used to position grouping
separators. For example,
"#,#@#" indicates a minimum of one
significant digit, a maximum of two significant digits, and a grouping size
of three.
- In order to enable significant digits formatting, use a pattern
containing the
'@' pattern character.
- In order to disable significant digits formatting, use a pattern that
does not contain the
'@' pattern character.
- The number of significant digits has no effect on parsing.
- Significant digits may be used together with exponential notation. Such
patterns are equivalent to a normal exponential pattern with a minimum and
maximum integer digit count of one, a minimum fraction digit count of the
number of '@' characters in the pattern - 1, and a maximum fraction digit
count of the number of '@' and '#' characters in the pattern - 1. For
example, the pattern
"@@###E0" is equivalent to
"0.0###E0".
- If significant digits are in use then the integer and fraction digit
counts, as set via the API, are ignored.
NEW Padding
DecimalFormat supports padding the result of
format to a
specific width. Padding may be specified either through the API or through
the pattern syntax. In a pattern, the pad escape character followed by a
single pad character causes padding to be parsed and formatted. The pad
escape character is '*' in unlocalized patterns. For example,
"$*x#,##0.00" formats 123 to
"$xx123.00", and 1234 to
"$1,234.00".
- When padding is in effect, the width of the positive subpattern,
including prefix and suffix, determines the format width. For example, in the
pattern
"* #0 o''clock", the format width is 10.
- The width is counted in 16-bit code units (Java
chars).
- Some parameters which usually do not matter have meaning when padding is
used, because the pattern width is significant with padding. In the pattern "
##,##,#,##0.##", the format width is 14. The initial characters "##,##," do
not affect the grouping size or maximum integer digits, but they do affect
the format width.
- Padding may be inserted at one of four locations: before the prefix,
after the prefix, before the suffix or after the suffix. If padding is
specified in any other location,
#applyPattern throws an
IllegalArgumentException. If there is no prefix, before the prefix and after
the prefix are equivalent, likewise for the suffix.
- When specified in a pattern, the 16-bit
char immediately
following the pad escape is the pad character. This may be any character,
including a special pattern character. That is, the pad escape
escapes the following character. If there is no character after
the pad escape, then the pattern is illegal.
Synchronization
DecimalFormat objects are not synchronized. Multiple threads should
not access one formatter concurrently.