demjson.py and jsonlint
This is a comprehensive Python language binding to the JSON language-independent data encoding standard, which is often used as a simpler substitute for XML in AJAX-based web applications.
Regarding Python 3.0: The newest 3.0 version of Python now includes built-in support for JSON (by absorbing the "simplejson" module by Bob Ippolito). However, as many find my module useful, and it has features that the competing modules don't have, I will continue to support demjson for the foreseeable future.
Version 1.4 was released on 2008-12-17. It is identical to version 1.3 released on 2008-03-19, except for a change in licensing terms. Read the list of changes.
Current version:
- Download (version 1.4, 2008-12-17):
  - Easy install: If you have Python setuptools installed, just type:
    easy_install demjson
  - Single file: download demjson.py (80 KiB)
    This is just the basic module; no documentation, tests, or other things.
  - Whole package: download demjson-1.4.tar.gz (53 KiB)
    Extract the tar file and then run:
    python setup.py install
  - Yum (Fedora Linux): demjson (including /usr/bin/jsonlint) is included in the Red Hat Fedora (Linux) package repository as of Fedora 9, as the package named "python-demjson". If it is not already installed, you can install it on Fedora by typing:
    yum install python-demjson
    Note that you may get version 1.3 rather than 1.4, but they are functionally equivalent.
- Documentation:
  - demjson module documentation
  - jsonlint command usage
- Package metadata: PKG-INFO, License, ReadMe, Changes
- Python package index (PyPI) page (aka, cheeseshop)
Requirements
This module is written entirely in Python and requires no additional components or modules apart from those in the standard Python library. It requires Python version 2.3 or later (but does not yet work with Python 3000).
Note that for full Unicode support of non-BMP characters, your Python interpreter must have been compiled with UCS-4 support.
Support for the Decimal type for extreme-precision floating point numbers requires Python 2.4 or greater. Only float is supported under Python 2.3.
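The requirements above can be checked from the interpreter itself. The following is a minimal sketch that uses only the standard sys module; the helper name describe_interpreter is hypothetical and is not part of demjson.

```python
import sys

def describe_interpreter():
    """Report the interpreter properties that the requirements mention."""
    return {
        # The module needs Python 2.3 or later.
        "version_ok": sys.version_info >= (2, 3),
        # Decimal support needs the decimal module (Python 2.4 or later).
        "decimal_ok": sys.version_info >= (2, 4),
        # A UCS-4 ("wide") build reports a maximum code point above 0xFFFF;
        # a narrow (UCS-2) build reports exactly 0xFFFF.
        "ucs4_build": sys.maxunicode > 0xFFFF,
    }

info = describe_interpreter()
for name in sorted(info):
    print("%s: %s" % (name, info[name]))
```

On a narrow build, characters beyond the BMP are stored as two surrogate code units, which is why the wide build matters for full non-BMP support.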
Improvements and changes in version 1.4
These are the most important changes in this release. For full details read the comprehensive change notes.
- License changed to the less restrictive GNU Lesser General Public License (LGPL), version 3 or later.
Improvements and changes in version 1.3
- Unicode default: The default for the escape_unicode parameter was changed to False, meaning that Unicode characters will be embedded in JSON strings directly rather than using \u-escapes if possible.
- Decimal: If you are using Python 2.4 or later, which includes the standard decimal module, then the Decimal type will be supported. Numbers which would overflow the built-in float type or would result in a loss of precision will now be stored as a Decimal instead.
- More types handled: Many more standard Python types are now handled and converted into JSON when possible, such as set, deque, array, and even complex (when the imaginary part is zero).
- RFC conformance: When in strict mode the module is now even closer to being completely conforming to the RFC specification.
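To illustrate why the Decimal item above matters, here is a small sketch that uses only Python's standard decimal and json modules (not demjson itself). It shows a number losing precision as a float and surviving as a Decimal. The parse_float hook belongs to the standard json module and is an assumption of this example; demjson chooses the type automatically as described above.

```python
import json
from decimal import Decimal

text = "3.141592653589793238462643383279"

as_float = json.loads(text)                         # binary float: trailing digits are lost
as_decimal = json.loads(text, parse_float=Decimal)  # Decimal keeps every digit

# Converting back to text shows which representation kept the original digits.
float_kept_digits = (repr(as_float) == text)
decimal_kept_digits = (str(as_decimal) == text)

# A literal too large for a float overflows to infinity, but fits in a Decimal.
huge = "1e400"
overflowed = (json.loads(huge) == float("inf"))
huge_decimal = json.loads(huge, parse_float=Decimal)
```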
Known issues/bugs:
The following are the known problems with this release, and what I plan to do about them.
- 2009-11-04: When encoding Python strings to JSON, occurrences of the character U+00FF (character code 255, or 0xFF) may result in an error. Until the next version, a simple one-line patch is to replace the occurrence of range(0,255) on line 910 of demjson.py with range(0,256). Reported by Tom Kho and Yanxin Shi.
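The one-line patch works because Python's range excludes its upper bound. A quick standalone check, independent of demjson (the surrounding code on line 910 is not reproduced here):

```python
# range(start, stop) stops *before* stop, so 255 (U+00FF) is missing from range(0, 255).
buggy = range(0, 255)
fixed = range(0, 256)

code_point = ord(u"\u00ff")      # 255

in_buggy = code_point in buggy   # False: U+00FF falls outside the range
in_fixed = code_point in fixed   # True: the patched range covers 0 through 255
```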
To report bugs or suggest changes, please email me. My contact information can be found on my home page.
Older versions:
- Version 1.4 (2008-12-17): demjson.py, demjson-1.4.tar.gz.
- Version 1.3 (2008-03-19): demjson.py, demjson-1.3.tar.gz.
- Version 1.2 (2007-11-06): demjson.py, demjson-1.2.tar.gz.
- Version 1.1 (2006-11-06): demjson.py, demjson-1.1.tar.gz.
- Version 1.0 (2006-08-10): demjson.py, demjson-1.0.tar.gz.
License
This is Free Software, licensed under the terms of the GNU LGPL version 3 or later.
This basically means you are completely free to use it however you want (even in proprietary systems) for any purpose. However, if you make modifications to it and you redistribute it to others, then you must make your modifications to this software public and also release it under the same license terms. Read the license text for all the details.
Older versions may have been released under a different license; be sure to read the LICENSE.txt file that is included in each package.
See also
Standards:
- JSON Homepage (json.org)
- RFC 4627: The application/json Media Type for JavaScript Object Notation (JSON)
- ECMA 262 (PDF), 3rd. edition (1999), aka ECMAscript/JavaScript.
- IEEE 754-1985: Standard for Binary Floating-Point Arithmetic.
Other python implementations:
Read my Comparing JSON Modules report.
- jsonlib
- JsonUtils
- python-cjson
- python-json
- simplejson
- zif.jsonserver (for Zope)
- pyparsing.jsonParser (decoding only)
Other:
- JSON Wikipedia article.
- Choosing a Python JSON Translator,
by Jim Washington, February 2007.
[Note this uses demjson version 1.1; the speed of the current version is many times better than what was reported here.]
Quick example
>>> import demjson
>>> demjson.encode( ['one',42,True,None] )
u'["one",42,true,null]'
>>> demjson.decode( u'["one",42,true,null]' )
['one', 42, True, None]
Module features
This implementation attempts to conform as closely as possible to the JSON specification (published as IETF RFC 4627). It can also be used in a non-strict mode where it is much closer to the JavaScript/ECMAScript syntax (published as ECMA 262).
It now comes with a jsonlint tool which can be used to validate your JSON documents for strict conformance to the RFC specification, as well as to reformat them, either by re-indenting or by producing minimal/canonical JSON output.
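The two kinds of reformatting mentioned above can be pictured with the standard library's json module. This is only an illustration of re-indented versus minimal output; it is not how jsonlint is implemented, and jsonlint's command-line options are not shown here.

```python
import json

data = json.loads('{"name": "demjson", "versions": [1.3, 1.4]}')

# Re-indented form: one item per line for readability.
pretty = json.dumps(data, indent=2, sort_keys=True)

# Minimal form: no whitespace around separators, keys sorted for a stable result.
minimal = json.dumps(data, separators=(",", ":"), sort_keys=True)

print(pretty)
print(minimal)
```

Both strings decode to the same data; only the whitespace differs.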
It has a strict and a non-strict mode when parsing JSON text, and many levels in between. The strict mode only allows input which precisely meets the syntax requirements of RFC 4627 (JSON) and no more; it is so strict that it could be used as a lint-style validation checker for your JSON encodings. When used in its non-strict mode it is much more liberal in what it accepts, following the JavaScript language specification more closely than the more restrictive JSON. When producing JSON, though, this module is almost always strictly conforming. The default mode is non-strict.
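As a rough picture of what strict parsing rejects, the sketch below feeds a few JavaScript-style inputs to Python's standard json module, which is used here only as a stand-in for an RFC-style parser. It is not demjson, and the helper name is_accepted is hypothetical.

```python
import json

def is_accepted(text):
    """Return True if the standard json module parses text, else False."""
    try:
        json.loads(text)
        return True
    except ValueError:   # decoding errors from json.loads are ValueErrors
        return False

# Plain RFC 4627 JSON parses fine.
strict_ok = is_accepted('["one", 42, true, null]')

# JavaScript-flavoured inputs that an RFC-style parser refuses.
javascript_only = [
    "['one']",          # single-quoted string
    "[1, 2] // note",   # comment
    "[0xA6]",           # hexadecimal literal
    "[+3]",             # leading plus sign
    "[.45]",            # missing integer part
    "[7.]",             # missing fraction digits
    "[01]",             # leading zero
    "{key: 1}",         # identifier used as a property name
    "[,,]",             # elided list elements
    "[undefined]",      # the undefined keyword
]
rejected = [text for text in javascript_only if not is_accepted(text)]
```

A non-strict parser in the sense described above would accept some or all of these inputs instead of raising an error.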
Some of the distinguishing features of this module are:
- When parsing JSON, it can operate in either strict (JSON) or non-strict (JavaScript) modes.
- It can produce either compact JSON-encoded strings with extraneous whitespace removed to save bandwidth, or it can produce "pretty-printed" and indented JSON for easy readability.
- Detailed exceptions are raised on all decoding or encoding errors to aid in debugging.
- Any Python object which behaves like a sequence or a mapping can be encoded into JSON, not just native list or dict types.
- Any user-defined Python class, especially if it is otherwise not directly encodable as JSON, can define a method named json_equivalent which will automatically be used to help translate objects of that class into an equivalent JSON encoding.
- Full Unicode support:
  - Handles all recommended character encoding schemes (ASCII, UTF-8, UTF-16, and UTF-32) with auto-detection, including UTF-32 (for which native Python has no built-in codec). Byte order marks are used and generated when applicable.
  - Correct handling of surrogate pairs for dealing with characters beyond the Unicode BMP, including the use of JSON's double \u "surrogate" escape sequences regardless of underlying encoding.
  - When generating JSON, all non-ASCII or control characters can either be encoded with the \u escape sequences for maximum portability, or optionally directly inserted as real characters for smaller size (unless the chosen character encoding can not represent the character, in which case the \u mechanism is always used).
  - Unicode format control characters are allowed anywhere in the input (non-strict mode only).
  - All Unicode line terminator characters are recognized (non-strict mode only).
  - All Unicode white space characters are recognized (non-strict mode only).
- Numeric formats:
  - The numbers +0 and -0 are kept as distinct values, as required by JavaScript.
  - Additional JSON restrictions on numeric literals can be enforced, so that "+3", ".45", "7.", and "01" are treated as errors (strict mode only).
  - Hexadecimal number literals are recognized (e.g., 0xA6) (non-strict mode only).
  - Octal integer literals may be recognized (e.g., 0137). Note that octal literals are not allowed in ECMAScript or JSON, and are supported only for compatibility with old versions of JavaScript (non-strict mode only).
  - Special non-number floating-point values are recognized and handled, including NaN, Infinity, and -Infinity. If your Python interpreter fully supports IEEE 754 then these will result in values of type float; otherwise a simulated user-defined type will be substituted (non-strict mode only).
- String literals:
  - String literals may use either single or double quote marks (non-strict mode only).
  - Strings may contain \x (hexadecimal) escape sequences, as well as the \v and \0 escape sequences omitted from JSON (non-strict mode only). If octal numbers are enabled then deprecated JavaScript octal escape sequences are also permitted.
- The undefined keyword is recognized (non-strict mode only).
- Lists may have omitted (elided) elements, e.g., [,,,,,], with missing elements interpreted as undefined values (non-strict mode only).
- Object property names (dictionary keys) can be of any of the types: string literals, numbers, or identifiers (the latter of which are treated as if they are string literals), as permitted by JavaScript (non-strict mode only). When in strict mode only string literals may be keys.
- JavaScript comments (// and /* */ style) are treated as whitespace (non-strict mode only).
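The surrogate-pair handling listed under Unicode support rests on a small piece of arithmetic defined by UTF-16. The sketch below shows that arithmetic on its own; to_surrogate_escapes is a hypothetical helper written for this illustration, not a function from demjson, and the standard json module is used only to confirm the result decodes back to the original character.

```python
import json

def to_surrogate_escapes(code_point):
    """Return the JSON backslash-u escape text for one Unicode code point."""
    if code_point <= 0xFFFF:
        # BMP characters need a single \uXXXX escape.
        return "\\u%04x" % code_point
    # Characters beyond the BMP are split into a high and a low surrogate.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return "\\u%04x\\u%04x" % (high, low)

e_acute = to_surrogate_escapes(0x00E9)    # a BMP character
g_clef = to_surrogate_escapes(0x1D11E)    # MUSICAL SYMBOL G CLEF, beyond the BMP

# Decoding the escaped text as a JSON string recovers the original character.
round_trip = json.loads('"%s"' % g_clef)
```

Because the escapes are plain ASCII, this form can be carried in any of the supported encodings.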
Each of the features (behaviors) listed above which work only when in a non-strict mode can be allowed or prevented individually. Thus for example you can choose to allow comments but still prevent the use of hexadecimal numbers.
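The json_equivalent hook described in the feature list can be pictured with the standard library alone. The sketch below is not demjson's implementation: it wires a json_equivalent method into the standard json.dumps through its default callback, and the Point class and use_json_equivalent function are hypothetical names chosen for this example.

```python
import json

class Point(object):
    """A user-defined class that is not directly encodable as JSON."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def json_equivalent(self):
        # Describe this object using plain JSON-compatible types.
        return {"x": self.x, "y": self.y}

def use_json_equivalent(obj):
    """Fallback for json.dumps: delegate to obj.json_equivalent() if present."""
    method = getattr(obj, "json_equivalent", None)
    if callable(method):
        return method()
    raise TypeError("%r is not JSON serializable" % (obj,))

encoded = json.dumps([Point(1, 2)], default=use_json_equivalent, sort_keys=True)
print(encoded)
```

With demjson itself no such wiring should be needed, since the feature list states that the method is used automatically.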
Coming later
I am working on a new version. Besides bug fixes, I hope to improve performance significantly.
demjson does not yet work with Python 3000 (aka 3.0). I am working on a native rewrite which will work properly in the new language (including using the correct semantics for bytes and strings).


