XML to JSON Conversion: Conventions, Pitfalls & Code Examples
You pull a response off a SOAP endpoint, an RSS feed, or a sitemap.xml, and it’s XML. Your stack is JSON-native: JavaScript on the front end, REST in the middle, a document store at the bottom. So you need to convert XML to JSON, and you reach for a parser expecting it to be a one-liner.
It usually is — until the output bites you. An array you expected turns out to be a single object. An id attribute vanishes. A ZIP code like 01234 comes back as the number 1234. None of these are bugs in your parser. They’re the consequence of mapping two data models that don’t line up, and the only way to convert XML to JSON reliably is to understand the conventions that bridge the gap.
This guide covers why those conventions exist, four ways to do the conversion (browser, JavaScript, Python, CLI), the @_ and #text rules that fast-xml-parser and xmltodict share (with a small difference in prefix spelling), the five pitfalls that cause silent data loss, and how to convert JSON back to XML for a clean round-trip. Paste the examples into Node, Python, or a shell and they produce the output shown in the comments.
Why XML-to-JSON Needs Conventions (Not Just a Reformat)
XML and JSON look similar at a glance: both are trees of named, nested data. But their underlying models diverge. XML elements can carry attributes, hold mixed content (text interleaved with child elements), and live under namespaces. JSON has none of those concepts. It has objects, arrays, and four scalar types. Converting one to the other isn’t reformatting; it’s translating between two grammars, and one of them has words the other can’t spell.
Before you convert anything, it pays to confirm the source is actually well-formed. A stray unescaped & or a mismatched tag will be rejected by the parser, so running the input through an XML Formatter to check well-formedness first saves a round of confusing errors.
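If you would rather script that check, here is a minimal sketch using only Python's standard library. The helper name check_well_formed is made up for this example; it tests well-formedness only, not validity against a schema.

```python
import xml.etree.ElementTree as ET

def check_well_formed(xml_text: str) -> tuple[bool, str]:
    """Return (True, "") if xml_text parses, else (False, parser message).

    Hypothetical helper: checks well-formedness only, not schema validity.
    """
    try:
        ET.fromstring(xml_text)
        return True, ""
    except ET.ParseError as err:
        return False, str(err)

# A well-formed document passes.
print(check_well_formed("<a><b>ok</b></a>")[0])    # True
# An unescaped ampersand is rejected.
print(check_well_formed("<a>Tom & Jerry</a>")[0])  # False
# So are mismatched tags.
print(check_well_formed("<a><b></a></b>")[0])      # False
```

The second element of the tuple carries the parser's own message, which usually includes a line and column to look at.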
Here is where the two models pull apart:
| Dimension | XML | JSON |
|---|---|---|
| Node types | elements, attributes, text, mixed content | objects, arrays, string, number, boolean, null |
| Root constraint | exactly one root element required | no root constraint |
| Attributes | yes (id="P01") | none (needs an @_ convention) |
| Repeated elements | same-named siblings are legal | object keys can’t repeat (needs an array convention) |
| Type system | text is untyped — everything is a string | native types |
| Namespaces | yes (xmlns) | none |
Because the models don’t match, every XML-to-JSON conversion is convention-driven, not lossless reformatting. The conventions aren’t arbitrary, though: fast-xml-parser (Node.js) and xmltodict (Python) landed on nearly the same two markers, an @-based prefix for attributes (@_ and a bare @, respectively) and #text for an element’s text when it sits alongside attributes. Learn them once and they transfer across runtimes with only the prefix spelling to adjust. Data-shape mismatches like this show up in other conversions too, such as the type-inference questions in the CSV to JSON conversion guide.
How to Convert XML to JSON: 4 Methods
Pick the method that fits your context: a quick one-off paste, a Node service, a Python pipeline, or a shell script in CI.
Method 1 — Browser-Based Tool (Zero Setup, Privacy-First)
For a one-off conversion, or for XML you’d rather not paste into a random website, an in-browser converter is the fastest path. Paste XML into the XML to JSON Converter, and the JSON appears instantly — no install, no account, no upload. Everything runs in your browser’s JavaScript engine, so the data never leaves the machine.
That detail matters here. SOAP envelopes carry WS-Security tokens, internal configs carry connection strings, and exports carry customer records. Because nothing is transmitted, the tool is safe for XML containing credentials or sensitive payloads. You can confirm it yourself: open the Network tab and watch zero requests fire as you convert.
Method 2 — JavaScript / Node.js (fast-xml-parser)
In Node, fast-xml-parser is the standard choice. The defaults will surprise you, though — attributes are ignored and values get coerced — so the options below are the ones you actually want for a faithful conversion:
// Convert XML to JSON in Node.js using fast-xml-parser
import { XMLParser } from 'fast-xml-parser';
const xml = `<catalog>
<product id="P01">
<name>Wireless Headphones</name>
<price currency="USD">79.99</price>
</product>
</catalog>`;
const parser = new XMLParser({
ignoreAttributes: false, // keep attributes (default drops them!)
attributeNamePrefix: '@_', // attributes become @_-prefixed keys
textNodeName: '#text', // mixed-content text goes under #text
parseAttributeValue: false, // no type coercion on attributes
parseTagValue: false, // no type coercion on element text
});
const result = parser.parse(xml);
console.log(JSON.stringify(result, null, 2));
// {
// "catalog": {
// "product": {
// "@_id": "P01",
// "name": "Wireless Headphones",
// "price": {
// "@_currency": "USD",
// "#text": "79.99"
// }
// }
// }
// }
The two settings people forget are ignoreAttributes: false and parseTagValue: false. The first keeps your id and currency attributes; the second stops the parser from turning "79.99" into a float and "01234" into 1234. We’ll come back to why string preservation is the safe default in the pitfalls section.
If you want zero dependencies in the browser, the native DOMParser does the parsing for you, and you walk the DOM yourself:
// Zero-dependency XML to JSON in the browser using DOMParser
function xmlToJson(node) {
  const children = Array.from(node.children);
  // Text-only element (no attributes, no child elements) → string value
  if (children.length === 0 && node.attributes.length === 0) {
    return node.textContent.trim();
  }
  const obj = {};
  // Attributes → @_ prefix
  for (const attr of node.attributes) {
    obj['@_' + attr.name] = attr.value;
  }
  // Attributes but no child elements → any text goes under #text
  if (children.length === 0) {
    const text = node.textContent.trim();
    if (text !== '') obj['#text'] = text; // no #text for empty elements like <user id="42"/>
    return obj;
  }
  // Recurse into children, collecting same-named siblings into arrays
  for (const child of children) {
    const value = xmlToJson(child);
    if (obj[child.tagName] === undefined) {
      obj[child.tagName] = value;
    } else {
      if (!Array.isArray(obj[child.tagName])) obj[child.tagName] = [obj[child.tagName]];
      obj[child.tagName].push(value);
    }
  }
  return obj;
}
const doc = new DOMParser().parseFromString(
'<catalog><product id="P01"><name>Wireless Headphones</name></product></catalog>',
'text/xml'
);
const json = { [doc.documentElement.tagName]: xmlToJson(doc.documentElement) };
console.log(JSON.stringify(json, null, 2));
// { "catalog": { "product": { "@_id": "P01", "name": "Wireless Headphones" } } }
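DOMParser is XML 1.0 compliant, handles CDATA and entity references, and needs no package install. It reports well-formedness errors by returning a document that contains a <parsererror> element rather than by throwing, so check for that element before walking the tree. The trade-off is that you own the traversal logic, including the array-collection rule shown above.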
Method 3 — Python (xmltodict)
In Python, xmltodict collapses the whole job into a short pipeline. It uses @ as its attribute prefix and #text for mixed content by default:
# Convert XML to JSON in Python using xmltodict
import json
import xmltodict
xml = """<catalog>
<product id="P01">
<name>Wireless Headphones</name>
<price currency="USD">79.99</price>
</product>
</catalog>"""
data = xmltodict.parse(xml)
print(json.dumps(data, indent=2))
# {
# "catalog": {
# "product": {
# "@id": "P01",
# "name": "Wireless Headphones",
# "price": {
# "@currency": "USD",
# "#text": "79.99"
# }
# }
# }
# }
By default xmltodict keeps every value as a string, which is the behavior you want. The one option worth knowing up front is force_list, which fixes the single-versus-many array problem before it reaches your code:
# force_list guarantees <product> is always a list, even when there is one
data = xmltodict.parse(xml, force_list={'product'})
products = data['catalog']['product'] # always a list now
for p in products:
    print(p['name'])
Without force_list, one <product> yields a dict and two yield a list — and your loop crashes on the single-item case. That’s pitfall #1, which we cover below.
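When installing xmltodict is not an option, the same conventions can be approximated with the standard library. The sketch below is not xmltodict's implementation: the function names are invented for this example, empty elements come back as "" (xmltodict returns None), text interleaved with child elements is only partly captured, and namespaced tags appear in ElementTree's {uri}local form.

```python
import json
import xml.etree.ElementTree as ET

def element_to_value(elem, attr_prefix="@", text_key="#text"):
    """Convert one element to a str or dict using the @ / #text conventions.

    Hypothetical helper, not part of xmltodict.
    """
    children = list(elem)
    text = (elem.text or "").strip()
    if not children and not elem.attrib:
        return text  # plain-text element -> direct string value
    node = {attr_prefix + name: value for name, value in elem.attrib.items()}
    if text:
        node[text_key] = text  # text next to attributes or children
    for child in children:
        value = element_to_value(child, attr_prefix, text_key)
        if child.tag in node:
            # Second or later same-named sibling: promote to a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    return node

def xml_to_dict(xml_text, **kwargs):
    """Parse xml_text and wrap the result under the root element's name."""
    root = ET.fromstring(xml_text)
    return {root.tag: element_to_value(root, **kwargs)}

xml = """<catalog>
  <product id="P01">
    <name>Wireless Headphones</name>
    <price currency="USD">79.99</price>
  </product>
</catalog>"""

print(json.dumps(xml_to_dict(xml), indent=2))
```

Passing attr_prefix="@_" switches the output to the fast-xml-parser spelling. Note that this sketch has the same one-versus-many behavior as the libraries: a single <product> is a dict and two are a list.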
Method 4 — CLI (yq / Python one-liner)
For shell scripts and CI pipelines, two one-liners cover most cases. Mike Farah’s yq reads XML and emits JSON directly:
# Using yq (Mike Farah's Go version)
yq -p=xml -o=json '.' input.xml
# Pipe from stdin
cat sitemap.xml | yq -p=xml -o=json '.'
If xmltodict is already in your environment, the Python one-liner needs no extra binary:
python3 -c "import sys, xmltodict, json; print(json.dumps(xmltodict.parse(sys.stdin.read()), indent=2))" < input.xml
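Both commands can read from stdin, so they drop straight into a pipeline, which is useful for converting an API response mid-script or normalizing a batch of files in a build step. One caveat: yq names attribute and text keys with its own defaults rather than @_ and #text (its --xml-attribute-prefix and --xml-content-name flags control this), so check the key names in its output before wiring it to code written against another library’s conventions.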
The @_ Attribute and #text Conventions Explained
Most converter pages skip the part that actually matters: what the odd-looking @_ and #text keys mean and why they exist. Once these click, the output stops looking arbitrary.
Attributes map to @_-prefixed keys. An attribute has no JSON equivalent — there’s no slot in an object for “metadata about this object” that’s distinct from a child. The convention is to give attributes a key prefixed with @_:
<user id="42" role="admin"/>
→ { "user": { "@_id": "42", "@_role": "admin" } }
Why @_ specifically? Because no valid XML element name can start with @, the prefix can never collide with a real child-element key. The character is reserved for free. (xmltodict uses bare @; fast-xml-parser uses @_ by default. The principle is identical.)
Text alongside attributes maps to #text. When an element has both an attribute and a text value, the text needs somewhere to live alongside the attribute keys. That’s #text:
<price currency="USD">29.99</price>
→ { "price": { "@_currency": "USD", "#text": "29.99" } }
Plain-text elements become a direct string value. No attributes, no children, just text — so there’s no need for the #text indirection. <name>Alice</name> becomes "name": "Alice". The #text key only appears when attributes force the element value to be an object.
This asymmetry is the source of a subtle bug. The same element name can produce a plain string in one document and an @_/#text object in another, depending on whether that particular instance carried an attribute. A <price> with no currency attribute is the string "29.99"; the same <price currency="USD"> is { "@_currency": "USD", "#text": "29.99" }. Code that reads node.price directly works for one shape and silently breaks on the other. The defensive accessor is to check the type: const amount = typeof node.price === 'object' ? node.price['#text'] : node.price;.
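The same defensive read can be sketched in Python for data parsed with xmltodict or a similar library. The helper name text_of is hypothetical, and the "#text" default assumes the conventions described in this section.

```python
def text_of(value, text_key="#text"):
    """Return the text of a converted element, whatever shape it took.

    Hypothetical helper: handles a plain string, an attribute-bearing
    dict with a #text key, and a missing (None) value.
    """
    if isinstance(value, dict):
        return value.get(text_key, "")
    return "" if value is None else value

plain = {"price": "29.99"}
with_attr = {"price": {"@_currency": "USD", "#text": "29.99"}}

print(text_of(plain["price"]))      # 29.99
print(text_of(with_attr["price"]))  # 29.99
```

Routing every read through one accessor keeps the shape check in a single place instead of scattering type tests across the consumer code.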
CDATA becomes plain text content. A <![CDATA[if (a < b) return;]]> section is just an escaping mechanism, so the delimiters are stripped and the inner text is preserved: "if (a < b) return;". Nothing special survives into the JSON.
Once you have output, paste it into a JSON Formatter to validate the JSON output and confirm the structure matches what your consumer expects before you wire it into code.
5 XML-to-JSON Pitfalls & How to Avoid Them
These are the failures that get past code review and show up in production. Each one traces back to the model mismatch from the start of this guide.
1. Array ambiguity (one vs. many). A single <item> becomes an object; two or more become an array. The JSON shape depends on how many siblings happened to be in that particular document. Consumer code like result.items.item.forEach(...) works in testing, where your fixture has three items, and throws TypeError: result.items.item.forEach is not a function in production when a record has exactly one.
// Two <book> siblings → array
// <library><book>A</book><book>B</book></library>
// → { "library": { "book": ["A", "B"] } }
// One <book> → object, NOT an array
// <library><book>A</book></library>
// → { "library": { "book": "A" } }
// Normalize so both cases behave identically
const books = [].concat(result.library?.book ?? []);
books.forEach(b => console.log(b)); // safe for 0, 1, or many
The [].concat(x ?? []) idiom is worth memorizing: a missing value becomes [], a single object becomes [object], and an existing array passes through unchanged. In Python, pass force_list={'book'} to xmltodict.parse() and the value is always a list, so you skip the normalization entirely.
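For Python data that has already been parsed without force_list, the same normalization can be done after the fact. This is a small sketch; as_list is a made-up helper name, not part of xmltodict.

```python
def as_list(value):
    """Normalize a converted value to a list: None -> [], list -> itself, else [value]."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]

one = {"library": {"book": "A"}}
many = {"library": {"book": ["A", "B"]}}
none = {"library": {}}

for doc in (one, many, none):
    books = as_list(doc["library"].get("book"))
    print(len(books), books)
```

The loop body is identical for zero, one, or many books, which is the property the consumer code needs.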
2. Attributes silently dropped. Several libraries default to ignoring attributes — fast-xml-parser does exactly this until you set ignoreAttributes: false. The conversion looks like it worked, the JSON parses fine, and your id, currency, and status values are simply gone. Always set the flag explicitly rather than trusting the default.
3. Namespace flattening. An xmlns declaration becomes an ordinary @_xmlns key, and the prefix in <soap:Body> survives only as part of the string key "soap:Body". The semantics — that two prefixes might bind to the same URI — are lost.
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<soap:Body>...</soap:Body>
</soap:Envelope>
→ {
"soap:Envelope": {
"@_xmlns:soap": "http://schemas.xmlsoap.org/soap/envelope/",
"soap:Body": "..."
}
}
The prefix soap: is now just text in a key name; nothing knows it’s a namespace. Two documents that bind the same namespace URI to different prefixes produce different keys, and if the converter is configured to strip prefixes, elements from different namespaces that share a local name collide. When precise namespace handling is part of the requirement, keep the data in a namespace-aware parser and don’t flatten it into JSON at all.
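As an illustration of what a namespace-aware parser gives you, Python's standard ElementTree resolves prefixes to their URIs, so the lookup below works no matter which prefix a document chose. The two sample documents and the body_text helper are invented for this sketch.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# Same namespace URI, two different prefixes.
doc_a = f'<soap:Envelope xmlns:soap="{SOAP_NS}"><soap:Body>hi</soap:Body></soap:Envelope>'
doc_b = f'<s:Envelope xmlns:s="{SOAP_NS}"><s:Body>hi</s:Body></s:Envelope>'

def body_text(xml_text):
    """Find the SOAP Body by namespace URI and local name, ignoring the prefix."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_NS}}}Body")  # ElementTree's {uri}local notation
    return None if body is None else body.text

print(ET.fromstring(doc_a).tag)  # {http://schemas.xmlsoap.org/soap/envelope/}Envelope
print(body_text(doc_a), body_text(doc_b))  # hi hi
```

A key-based JSON lookup for "soap:Body" would find the body in the first document and miss it in the second, even though the two are equivalent XML.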
4. No type coercion — and that’s correct. <zip>01234</zip> must not become 1234. Account codes, postal codes, padded identifiers, and precision-sensitive decimals all break under silent coercion. A good converter keeps everything as a string and lets you coerce deliberately:
// Don't rely on implicit coercion
if (config.timeout > 25) { /* fragile: "30" > 25 happens to work */ }
// Coerce explicitly, only where you know the type
if (parseInt(config.timeout, 10) > 25) { /* safe */ }
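The same idea in Python, sketched with a per-field schema: only the fields you name are converted, and everything else stays a string. The SCHEMA mapping and the coerce helper are hypothetical names for this example, and Decimal is used for the price so no precision is lost to binary floats.

```python
from decimal import Decimal

# Values as they come out of a string-preserving converter.
record = {"zip": "01234", "timeout": "30", "price": "79.99"}

# Hypothetical schema: field name -> conversion function.
SCHEMA = {"timeout": int, "price": Decimal}

def coerce(rec, schema):
    """Convert only the fields listed in schema; leave the rest untouched."""
    return {key: schema[key](value) if key in schema else value
            for key, value in rec.items()}

typed = coerce(record, SCHEMA)
print(typed["zip"])           # 01234  (still a string, leading zero intact)
print(typed["timeout"] > 25)  # True   (a real integer comparison)
print(typed["price"])         # 79.99  (exact Decimal)
```

A value that does not fit its declared type raises an exception at the conversion step, which is easier to diagnose than a wrong number surfacing later.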
5. Lossy: comments, processing instructions, and mixed-content order. XML comments (<!-- ... -->) and processing instructions (<?xml-stylesheet ?>) have no JSON home and are discarded. The relative order of text interleaved with child elements may not round-trip. If you need every byte preserved — for re-emitting the exact source document — don’t convert at all; use an XML Formatter to reformat or minify without touching the data model.
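To see what "mixed-content order" means concretely, the sketch below lists the text runs and child elements of a paragraph in document order using Python's standard ElementTree, which stores trailing text on each child's tail attribute. The ordered_content helper is a made-up name for illustration.

```python
import xml.etree.ElementTree as ET

def ordered_content(elem):
    """List the text runs and child tags of elem in document order."""
    parts = []
    if elem.text:
        parts.append(("text", elem.text))
    for child in elem:
        parts.append(("element", child.tag))
        if child.tail:  # text that follows this child inside the parent
            parts.append(("text", child.tail))
    return parts

p = ET.fromstring("<p>Hello <b>big</b> world</p>")
print(ordered_content(p))
# [('text', 'Hello '), ('element', 'b'), ('text', ' world')]
```

An object with one #text key and one b key has no way to record that the text was split around the child, which is why this ordering may not survive a trip through JSON.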
Converting JSON Back to XML (Round-Trip)
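Going the other direction has its own twist, because JSON has no root-element rule and XML requires exactly one. The companion JSON to XML Converter applies the same @_/#text conventions in reverse, so a JSON → XML → JSON trip preserves attributes, text, and structure. The one exception is the array ambiguity from pitfall #1: a one-element array is written as a single element and comes back as a single value, not an array.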
The interesting part is root normalization. The converter resolves the single-root requirement with four rules:
- Single-key object → that key becomes the root: { "config": {...} } → <config>...</config>.
- Multi-key object → wrapped in <root>: { "a": 1, "b": 2 } → <root><a>1</a><b>2</b></root>.
- Top-level array → wrapped as <root><item>...</item></root>, with <item> as a fixed fallback name.
- Primitive value → <root>value</root>.
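The four rules can be sketched as a small Python function that decides the root element name and the content to serialize under it. This is one reading of the rules listed above, not the converter's actual code; normalize_root is a hypothetical name, and treating an empty object as the multi-key case is an assumption.

```python
def normalize_root(value):
    """Return (root_name, content) for any JSON value, following the four rules."""
    if isinstance(value, dict) and len(value) == 1:
        # Rule 1: single-key object -> that key is the root.
        (name, content), = value.items()
        return name, content
    if isinstance(value, dict):
        # Rule 2: multi-key object -> wrap in <root>.
        # Assumption: an empty object is treated the same way.
        return "root", value
    if isinstance(value, list):
        # Rule 3: top-level array -> <root><item>...</item></root>.
        return "root", {"item": value}
    # Rule 4: primitive -> <root>value</root>.
    return "root", value

print(normalize_root({"config": {"debug": "true"}}))  # ('config', {'debug': 'true'})
print(normalize_root({"a": 1, "b": 2}))               # ('root', {'a': 1, 'b': 2})
print(normalize_root([1, 2]))                         # ('root', {'item': [1, 2]})
print(normalize_root("hello"))                        # ('root', 'hello')
```

Because every input ends up as exactly one name plus its content, the serializer that follows never has to special-case the top level.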
Everything else mirrors the forward direction. @_ keys become attributes, #text becomes text content, and a JSON array under a key produces repeated same-named siblings — the key name is reused, never singularized:
// Convert JSON to XML in Node.js using fast-xml-parser
import { XMLBuilder } from 'fast-xml-parser';
const data = {
catalog: {
product: {
'@_id': 'P01',
name: 'Wireless Headphones',
price: { '@_currency': 'USD', '#text': '79.99' },
},
},
};
const builder = new XMLBuilder({
attributeNamePrefix: '@_', // @_ keys become attributes
textNodeName: '#text', // #text key becomes text content
ignoreAttributes: false, // process @_ keys
format: true, // pretty-print
});
console.log(builder.build(data));
// <catalog>
// <product id="P01">
// <name>Wireless Headphones</name>
// <price currency="USD">79.99</price>
// </product>
// </catalog>
One detail the builder handles for you: special characters in text and attribute values (<, >, &, ") are escaped to their entity references, so the output stays well-formed.
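If you build XML strings by hand in Python instead, the standard library provides the same escaping. The sketch below uses xml.sax.saxutils; the show element and its title attribute are invented sample data, and parsing the result back is just a way to confirm nothing was lost.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

text = "Tom & Jerry <live>"
title = 'say "hi" & wave'

# escape() handles &, <, > in text; quoteattr() escapes the value AND
# adds the surrounding quotes, choosing a quote style that is safe.
xml = f"<show title={quoteattr(title)}>{escape(text)}</show>"
print(escape(text))  # Tom &amp; Jerry &lt;live&gt;

# Round-trip check: the parser recovers the original strings.
parsed = ET.fromstring(xml)
print(parsed.text == text, parsed.get("title") == title)  # True True
```

Skipping this step is how a stray & in a product name turns into a document the next parser rejects.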
FAQ
How do XML attributes map to JSON?
Attributes become prefixed keys, so id="42" turns into "@_id": "42". fast-xml-parser uses @_ by default and xmltodict uses a bare @ ("@id"); either way, the prefix never collides with element names because no valid element name starts with @.
Why does XML to JSON keep numbers as strings?
Because the converter does no type coercion. Forcing 01234 into 1234 would drop a meaningful leading zero from ZIP codes, account numbers, and padded IDs. Keeping every value as a string is the safe default; coerce deliberately downstream where you know the type.
Is XML to JSON conversion lossless?
No. Comments and processing instructions are discarded, namespace semantics are only partially preserved, and mixed-content ordering may not round-trip. When you need every byte preserved, use an XML Formatter to reformat the XML instead of converting it to JSON.
How are repeated XML elements handled in JSON?
A single same-named child becomes a single value (a string or an object); two or more become an array. Because the shape depends on sibling count, your consumer code should always normalize to an array so it handles both the one-item and many-item cases without crashing.
What happens to XML namespaces when converting to JSON?
An xmlns declaration becomes an ordinary @_xmlns key, and the prefix stays inside the element-name string, as in "soap:Body". The semantic binding of a prefix to a URI is not interpreted, so distinct namespaces can flatten together.
How do I convert JSON back to XML?
Use the companion JSON to XML Converter. It applies the same @_ and #text conventions in reverse, so attributes, text content, and arrays map back symmetrically. That symmetry is what makes a clean JSON → XML → JSON round-trip possible.
Can I convert XML with multiple root elements?
No. Multiple top-level elements are not well-formed XML, so the parser rejects the input. Wrap the fragments in a single root element first — turn <a/><b/> into <root><a/><b/></root> — then convert.
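A minimal Python sketch of that wrapping step, using only the standard library. The parse_fragments name and the default wrapper name root are assumptions for illustration, and the approach fails if the fragment text begins with an XML declaration or DOCTYPE, since those cannot appear inside an element.

```python
import xml.etree.ElementTree as ET

def parse_fragments(fragment_text, wrapper="root"):
    """Wrap sibling fragments in a single root element, then parse.

    Hypothetical helper; assumes fragment_text has no XML declaration.
    """
    return ET.fromstring(f"<{wrapper}>{fragment_text}</{wrapper}>")

root = parse_fragments("<a/><b/>")
print(root.tag, [child.tag for child in root])  # root ['a', 'b']
```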
Conclusion
XML-to-JSON conversion is convention-driven, not a reformat. The rules are consistent across runtimes: attributes map to @_ keys, mixed-content text to #text, repeated siblings to arrays, and values stay strings so leading zeros and precision survive. The traps to remember are the single-versus-array shape shift, silently dropped attributes, and the loss of comments and namespace semantics. None of those are bugs; all of them are predictable once you know the model mismatch behind them.
When you need a quick, private conversion, paste into the XML to JSON Converter — it runs entirely in your browser. Validate the source first with the XML Formatter, and go the other direction with the JSON to XML Converter when you need round-trip XML. For more on how data-format models shape conversion behavior, see the notes on YAML and JSON differences.