demjson.py and jsonlint
This is a comprehensive Python language binding for JSON, the
language-independent data encoding standard that is often used as a
simpler substitute for XML in AJAX-based web applications.
The current version is 1.6, which was released on 2011-04-01.
It is a minor bugfix to version 1.5, which was released on 2010-10-10.
Mirrors and source code
Download and install
- PIP install: If you have pip installed, just type:
pip install demjson
or possibly
pip-python install demjson
- Easy install: If you have
setuptools installed, just type:
easy_install demjson
- Single file: download demjson.py
(86 KiB)
This is just the basic module, without documentation, tests, or other supporting files.
- Whole package: download demjson-1.6.tar.gz
(63 KiB)
Extract the tar file and then run:
python setup.py install
- Yum (Fedora Linux): demjson (including /usr/bin/jsonlint) is
included in the Red Hat Fedora (Linux)
package repository as of Fedora 9 or later, as the package named
"python-demjson". If it is not already installed, you can install it on Fedora by typing:
yum install python-demjson
Check the Fedora build page
for the latest build information.
Documentation
Documentation for the current version includes:
Requirements
This module is written entirely in Python
and requires no additional components or modules apart from those in
the standard Python library. It requires Python version 2.3
or higher (but does not yet work with Python 3.x).
Note that for full Unicode support of non-BMP characters, your Python interpreter
must have been compiled with UCS-4 support.
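One way to check which kind of interpreter you have is to look at sys.maxunicode from the standard library. This is only a small sketch; the helper name supports_non_bmp is made up for illustration and is not part of demjson.

```python
import sys

def supports_non_bmp():
    """Return True if this interpreter stores non-BMP characters as single code points."""
    # Narrow (UCS-2) builds report 0xFFFF here; wide (UCS-4) builds report 0x10FFFF.
    return sys.maxunicode > 0xFFFF

print("largest code point: U+%X" % sys.maxunicode)
print("wide (UCS-4) build: %s" % supports_non_bmp())
```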
Support for the Decimal type
for extreme-precision floating point numbers
requires Python 2.4 or greater. Only float is supported under
Python 2.3.
Regarding Python 3.0: The newest 3.0 version of Python now includes built-in support for
JSON (by absorbing the "simplejson" module by Bob Ippolito). However,
as many people find my module useful, and as it has features that the competing
modules don't have, I will continue to support demjson for the
foreseeable future.
Improvements and changes in version 1.6
- Bug fix: The jsonlint tool failed to accept a JSON document from
standard input (stdin). Also added support for the --version and
--copyright options to jsonlint.
Improvements and changes in version 1.5
- Bug fix: When encoding Python strings to JSON, occurrences of the character U+00FF
(code point 255, or 0xFF) could result in an error.
Improvements and changes in version 1.4
These are the most important changes in this release. For
full details read the comprehensive change notes.
- License changed to the less restrictive GNU Lesser General Public License (LGPL), version 3 or later.
Improvements and changes in version 1.3
- Unicode default:
The default for the escape_unicode parameter was
changed to False, meaning that Unicode characters
will be embedded in JSON strings directly rather than using
\u-escapes if possible.
- Decimal:
If you are using Python 2.4 or later, which includes the standard
decimal module, then the Decimal type will be supported. Numbers which
would overflow the built-in float type or would result
in a loss of precision will now be stored as a Decimal instead.
- More types handled:
Many more standard Python types are now handled and converted into
JSON when possible, such as set, deque, array, and even complex
(when the imaginary part is zero).
- RFC conformance:
When in strict mode the module is now even closer to being
completely conforming to the RFC specification.
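To see why the Decimal change above matters, the following sketch uses only the standard decimal module. It shows the motivation, not demjson's internal decision logic, and the sample literals are chosen just for illustration.

```python
from decimal import Decimal

literal = "3.141592653589793238462643383279"

as_float = float(literal)      # rounded to the nearest IEEE 754 double
as_decimal = Decimal(literal)  # keeps every digit of the literal

print(repr(as_float))
print(str(as_decimal))

# A literal too large for a double overflows to infinity as a float,
# but remains an ordinary finite number as a Decimal.
huge = "1e400"
print(float(huge))
print(Decimal(huge))
```

Converting the long literal to float discards the trailing digits, while the Decimal round-trips back to the same text.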
Known issues and bugs
The following are the known problems with this release, and what
I plan to do about them.
- Will not work in Python 3.x.
To report bugs or suggest changes, please email me. My
contact information can be found on my home page.
Older versions
License
This is Free Software, licensed under the terms of the
GNU LGPL
version 3 or later.
This basically means you are completely free to use it however you
want (even in proprietary systems) for any purpose. However, if you
make modifications to it and you redistribute it to others,
then you must make your modifications to this software public and
also release it under the same license terms. Read the license text for
all the details.
Older versions may have been released under a different license;
be sure to read the LICENSE.txt file that is included in each
package.
See also
Standards:
Other python implementations:
Read my Comparing JSON Modules report.
Useful links:
Quick example
>>> import demjson
>>> demjson.encode( ['one',42,True,None] )
u'["one",42,true,null]'
>>> demjson.decode( u'["one",42,true,null]' )
['one', 42, True, None]
Module features
This implementation attempts to conform as closely as possible to
the JSON specification (published as IETF RFC 4627).
It can also be used in a non-strict mode where it
is much closer to the JavaScript/ECMAScript syntax (published as
ECMA 262).
It now comes with a jsonlint tool, which can be used to
validate your JSON documents for strict conformance to the
RFC specification, as well as to reformat them, either by
re-indenting or by producing minimal/canonical JSON output.
It has a strict and non-strict mode when parsing JSON text, and
many levels in between. The strict mode only allows input which
precisely meets the syntax requirements of RFC 4627 (JSON) and no more
(it is so strict it could be used as a lint-style validation checker
for your JSON encodings). But when used in its non-strict mode it is
much more liberal in what it accepts by following more closely the
JavaScript language specification rather than the more restrictive
JSON. When producing JSON, though, this module is almost always
strictly conforming. The default mode is non-strict.
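To give a feel for the gap between the two modes, the sketch below feeds a few JavaScript-style inputs to the standard library json parser, used here only as a convenient stand-in for a strict parser (it is not perfectly strict itself, and it is not demjson). The helper name stdlib_accepts and the sample inputs are chosen for illustration.

```python
import json

# Inputs of the kind a JavaScript-style (non-strict) parser is described as
# accepting, but which fall outside the RFC 4627 grammar.
javascript_only = [
    "{'a': 1}",           # single-quoted strings
    "[1, /* two */ 3]",   # comments
    "[0xA6]",             # hexadecimal number
    "{a: 1}",             # identifier used as a property name
    "[1, , 3]",           # elided list element
    "[undefined]",        # the undefined keyword
]

def stdlib_accepts(text):
    """Return True if the standard library parser accepts the text."""
    try:
        json.loads(text)
    except ValueError:  # JSONDecodeError is a subclass of ValueError
        return False
    return True

for text in javascript_only:
    print("%-20s accepted? %s" % (text, stdlib_accepts(text)))
```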
Some of the distinguishing features of this module are:
- When parsing JSON, it can operate in either strict (JSON) or
non-strict (JavaScript) modes.
- It can produce either compact JSON-encoded strings with extraneous
whitespace removed to save bandwidth, or it can produce
"pretty-printed" and indented JSON for easy readability.
- Detailed exceptions are raised on all decoding or encoding
errors to aid in debugging.
- Any Python object which behaves like a sequence or a mapping can
be encoded into JSON, not just native
list or
dict types.
- Any user-defined Python class, especially if it is otherwise not
directly encodable as JSON, can define a method named
json_equivalent which will automatically be used to help
translate objects of that class into an equivalent JSON encoding.
- Full Unicode support:
- Handles all recommended character encoding schemes
(ASCII,
UTF-8,
UTF-16, and UTF-32) with auto-detection, including UTF-32
(for which native Python has no built-in codec). Byte order marks
are used and generated when applicable.
- Correct handling of surrogate pairs for dealing with characters
beyond the Unicode BMP, including the use of JSON's double
\u "surrogate" escape sequences regardless of
underlying encoding.
- When generating JSON, all non-ASCII or control characters can
either be encoded with the
\u escape sequences for
maximum portability, or optionally directly inserted as real
characters for smaller size (unless the chosen character encoding
cannot represent the character, in which case the \u
mechanism is always used).
- Unicode format control characters are allowed anywhere in the
input (non-strict mode only).
- All Unicode line terminator characters are recognized
(non-strict mode only).
- All Unicode white space characters are recognized (non-strict
mode only).
- Numeric formats:
- The numbers +0 and -0 are kept
as distinct values, as required by JavaScript.
- Additional JSON restrictions on numeric literals can be enforced, so that
"+3", ".45", "7.", and "01" are treated as errors (strict mode only).
- Hexadecimal number literals are recognized (e.g.,
0xA6) (non-strict mode only).
- Octal integer literals may be recognized (e.g., 0137). Note
that octal literals are not allowed in ECMAScript or JSON, and are
only supported for compatibility with old versions of JavaScript (non-strict
mode only).
- Special non-number floating-point values are recognized and
handled, including NaN, Infinity, and -Infinity. If your Python
interpreter fully supports IEEE 754 then these will result in
values of type float; otherwise a simulated user-defined type
will be substituted (non-strict mode only).
- String literals:
- String literals may use either single or double quote
marks (non-strict mode only).
- Strings may contain \x (hexadecimal) escape
sequences, as well as the \v and \0 escape
sequences omitted from JSON (non-strict mode only). If octal
numbers are enabled then deprecated JavaScript octal escape
sequences are also permitted.
- The undefined keyword is recognized (non-strict mode only).
- Lists may have omitted (elided) elements, e.g., [,,,,,],
with missing elements interpreted as undefined values
(non-strict mode only).
- Object property names (dictionary keys) can be of any of the
types: string literals, numbers, or identifiers (the latter of which
are treated as if they are string literals), as permitted by
JavaScript (non-strict mode only). When in strict mode only string
literals may be keys.
- JavaScript comments (// and /* */ style)
are treated as whitespace (non-strict mode only).
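As an illustration of the json_equivalent feature listed above, here is a minimal sketch of how a class might supply such a method. demjson is described as calling the method automatically; so that the sketch runs without demjson installed, it emulates that behavior with the standard library json module and its default hook. The Point class and the use_json_equivalent helper are hypothetical names.

```python
import json

class Point(object):
    """A user-defined class that is not directly encodable as JSON."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def json_equivalent(self):
        # Return plain data that any JSON encoder already understands.
        return {"x": self.x, "y": self.y}

def use_json_equivalent(obj):
    """Fallback hook: ask the object for its JSON-friendly form."""
    method = getattr(obj, "json_equivalent", None)
    if callable(method):
        return method()
    raise TypeError("%r is not JSON serializable" % (obj,))

# The stdlib encoder calls the hook only for objects it cannot encode itself.
text = json.dumps([Point(1, 2)], default=use_json_equivalent, sort_keys=True)
print(text)
```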
Each of the features (behaviors) listed above which work only when
in a non-strict mode can be allowed or prevented individually. Thus
for example you can choose to allow comments but still prevent the use
of hexadecimal numbers.
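The idea of switching individual behaviors on and off can be sketched with a toy checker for numeric literals. This is not demjson's API: the NumberPolicy class, its method names, and the behavior names are all hypothetical, and only a handful of the numeric behaviors listed above are modeled.

```python
import re

# Grammar for a number in strict JSON (RFC 4627).
STRICT_NUMBER = re.compile(r"-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?\Z")
HEX_NUMBER = re.compile(r"0[xX][0-9a-fA-F]+\Z")

class NumberPolicy(object):
    """Hypothetical policy object: each relaxation is switched on by name."""

    KNOWN = ("leading_plus", "initial_decimal_point",
             "trailing_decimal_point", "hex_numbers")

    def __init__(self):
        self.allowed = set()  # start out fully strict

    def allow(self, name):
        if name not in self.KNOWN:
            raise ValueError("unknown behavior: %s" % name)
        self.allowed.add(name)

    def prevent(self, name):
        self.allowed.discard(name)

    def accepts(self, literal):
        if STRICT_NUMBER.match(literal):
            return True
        if "hex_numbers" in self.allowed and HEX_NUMBER.match(literal):
            return True
        # Undo each permitted relaxation, then re-check the strict grammar.
        if "leading_plus" in self.allowed and literal.startswith("+"):
            literal = literal[1:]
        if "trailing_decimal_point" in self.allowed and re.search(r"[0-9]\.\Z", literal):
            literal = literal + "0"
        if "initial_decimal_point" in self.allowed:
            literal = re.sub(r"\A(-?)\.", r"\g<1>0.", literal)
        return bool(STRICT_NUMBER.match(literal))

# Allow hexadecimal numbers while every other relaxation stays prevented.
policy = NumberPolicy()
policy.allow("hex_numbers")
for literal in ("42", "0xA6", "+3", ".45", "7.", "01"):
    print("%-5s %s" % (literal, policy.accepts(literal)))
```

Keeping each relaxation as a separately named switch is what makes combinations such as "hexadecimal numbers but no leading plus sign" possible.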
Coming later
I am working on a new version. Besides bug fixes, I hope
to improve performance significantly.
Does not work with Python 3.x yet.
I am working on a native rewrite which will work properly in the new
language (including using the correct semantics for bytes and
strings).