Express.js XML External Entity (XXE) via xml2json Event Handler

Severity: High Risk
Category: XML External Entity (XXE)
Tags: express, xxe, xml, xml2json, javascript, external-entities

What it is

The Express.js application passes untrusted XML to the xml2json library, typically from request or parser event handlers, without controlling how entities are processed. When the parsing stack is configured to process external entities, attackers can read local files, perform SSRF attacks, or cause denial of service through entity expansion.

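// Vulnerable: xml2json with default settings
const express = require('express');
const xml2json = require('xml2json');

const app = express();
app.use(express.json());

app.post('/parse-xml', (req, res) => {
  const xmlData = req.body.xml;
  // Untrusted XML reaches the parser with no entity or DTD controls
  const json = xml2json.toJson(xmlData, {
    object: true,
    reversible: false
  });
  res.json(json);
});

// Secure: Disable external entities
const xml2json = require('xml2json');
const libxmljs = require('libxmljs');

app.post('/parse-xml', (req, res) => {
  const xmlData = req.body.xml;
  if (typeof xmlData !== 'string') {
    return res.status(400).json({ error: 'Invalid XML' });
  }
  try {
    // noent: false leaves entity references unsubstituted (noent: true would expand them);
    // dtdload: false and nonet: true block external DTD loading and network fetches
    const doc = libxmljs.parseXml(xmlData, { noent: false, dtdload: false, nonet: true });
    const json = xml2json.toJson(doc.toString(), {
      object: true,
      reversible: false
    });
    res.json(json);
  } catch (error) {
    // Malformed XML is reported as a client error instead of crashing the handler
    res.status(400).json({ error: 'Invalid XML' });
  }
});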

💡 Why This Fix Works

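The secure version parses the request body with libxmljs before conversion, with noent: false so entity references are not substituted, and wraps parsing in try/catch so malformed XML returns a 400 response instead of an unhandled error. Only XML that survives this hardened parse is passed to xml2json.toJson().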

Why it happens

XXE arises when untrusted XML reaches a parser that resolves entity declarations and DTDs. In Express applications using xml2json, this usually comes down to relying on default parser behavior, skipping input validation, or explicitly enabling DTD features, as described in the root causes below.

Root causes

xml2json Default Settings Allow External Entities

The xml2json library hands the XML string to a native parser, and its documented toJson() options shape the output rather than control entity or DTD handling. This page treats libxmljs as that parsing layer; the published xml2json package lists node-expat as its dependency, so confirm which parser your installed version actually uses. Applications that call xml2json.toJson(xmlString) or xml2json.toJson(xmlString, {}) inherit whatever defaults that parser has. If the parser resolves entity declarations like <!ENTITY xxe SYSTEM "file:///etc/passwd">, untrusted input enables file disclosure, SSRF, or denial of service through entity expansion (the billion laughs attack).
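To make the entity expansion risk concrete, the sketch below builds a nested-entity document of the billion laughs kind and computes its size after full expansion. The function names and the entity names e0..eN are illustrative, not taken from any library; nothing here parses the payload.

```javascript
// Illustrative only: builds a nested-entity ("billion laughs") document and
// computes how large it becomes once every entity reference is expanded.
function buildNestedEntityPayload(levels, fanout, base = 'lol') {
  const decls = [`<!ENTITY e0 "${base}">`];
  for (let i = 1; i <= levels; i += 1) {
    // Each level references the previous level `fanout` times.
    const refs = `&e${i - 1};`.repeat(fanout);
    decls.push(`<!ENTITY e${i} "${refs}">`);
  }
  return `<?xml version="1.0"?><!DOCTYPE r [${decls.join('')}]><r>&e${levels};</r>`;
}

function expandedLength(levels, fanout, base = 'lol') {
  // One reference to the top level expands to fanout^levels copies of `base`.
  return base.length * fanout ** levels;
}

const payload = buildNestedEntityPayload(9, 10);
console.log(payload.length < 1024);   // true: the document is tiny on the wire
console.log(expandedLength(9, 10));   // 3000000000 characters after expansion
```

A request-size limit alone does not stop this payload, because the amplification happens inside the parser.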

Processing Untrusted XML Without Disabling External Entities

Express applications parse XML from user uploads, API requests, or webhooks without explicitly disabling external entity resolution in the parser configuration. Developers use xml2json for the convenience of converting XML to JSON objects but overlook the security implications. Applications process XML from Content-Type: application/xml POST requests, SOAP messages, RSS/Atom feeds, or SVG file uploads without parser options such as {noent: false} that keep entity expansion off. Note that xml2json's sanitize option only escapes special characters in element values; it is not an XXE control. Untrusted XML is treated as a plain data format rather than as input that can make the server read files or issue network requests.

Event-Driven Parsing Without Security Configuration

Applications use event-based parsing, or call xml2json from inside event handlers, without configuring security settings on the underlying parser. (parseString() belongs to the separate xml2js package, not to xml2json.) Event-driven parsing processes XML incrementally through callbacks but uses the same entity processing as synchronous parsing. Developers implement custom event handlers for XML elements without realizing that entity resolution occurs before events fire, so malicious external entity content is fetched and expanded before application code can validate or sanitize it.
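Because entity resolution happens before element callbacks run, any check has to be applied to the raw text before a parser sees it. The sketch below is one way to do that inside request event handlers; collectXmlBody is a hypothetical helper, and the EventEmitter stands in for an Express request stream.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: buffers a request body from 'data' events, enforces a
// size cap, and rejects DTD markers before any XML parser sees the text.
function collectXmlBody(req, maxBytes, callback) {
  const chunks = [];
  let received = 0;
  let finished = false;

  const finish = (err, xml) => {
    if (finished) return; // report only the first outcome
    finished = true;
    callback(err, xml);
  };

  req.on('data', (chunk) => {
    if (finished) return;
    const buf = Buffer.from(chunk);
    received += buf.length;
    if (received > maxBytes) finish(new Error('XML body too large'));
    else chunks.push(buf);
  });

  req.on('end', () => {
    const xml = Buffer.concat(chunks).toString('utf8');
    if (/<!DOCTYPE|<!ENTITY/i.test(xml)) finish(new Error('DTD content is not accepted'));
    else finish(null, xml);
  });

  req.on('error', (err) => finish(err));
}

// Stand-in for an Express `req` stream, for demonstration only.
const fakeReq = new EventEmitter();
collectXmlBody(fakeReq, 1024, (err, xml) => console.log(err ? err.message : xml));
fakeReq.emit('data', '<!DOCTYPE r [<!ENTITY x SYSTEM "file:///etc/passwd">]><r>&x;</r>');
fakeReq.emit('end'); // logs "DTD content is not accepted"
```

The text check assumes UTF-8 input and is a defense-in-depth layer; it does not replace hardened parser options.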

Missing Pre-Parse XML Structure Validation

Applications fail to validate XML structure, size limits, or suspicious patterns before passing input to the xml2json parser. They do not check for DOCTYPE declarations, ENTITY keywords, SYSTEM/PUBLIC references, or file:// URLs in the XML input, and they do not enforce a maximum size to limit denial of service through large entity expansions. XML passes directly from req.body to the parser without intermediate inspection. This lack of pre-parse validation lets attackers submit crafted XXE payloads that exploit the parser before application logic can detect anomalies.

Explicitly Enabling DOCTYPE Declarations

Some applications explicitly enable DTD processing on the parser in the belief that it is required for valid XML, for example libxmljs.parseXml(xmlString, {dtdload: true, dtdvalid: true}). DTD (Document Type Definition) loading and validation are the mechanisms that make external entity attacks possible. Developers enable DTD features for validation or legacy XML compatibility without understanding the security implications. Even well-intentioned DTD validation introduces XXE risk unless entity substitution stays off (noent: false) and network access is blocked (nonet: true).

Fixes

1. Disable External Entity Processing in xml2json Configuration

Configure the parsing step to explicitly disable external entity resolution and DTD processing using libxmljs options: {noent: false, dtdload: false, dtdattr: false, dtdvalid: false}. In libxml2 terms NOENT means "substitute entities", so noent: true turns entity expansion on and noent: false leaves it off, while dtdload: false prevents loading external DTDs. xml2json's documented toJson() options do not include parseOptions, so confirm that any parser options you pass are honored by your installed version, or apply them in a libxmljs pre-parse step as in the secure example above. Apply the hardened configuration to every xml2json.toJson() call that processes untrusted input. Test the configuration by attempting to parse XML with external entities and verifying that they are rejected or left unresolved.
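One way to test a configuration is to feed it a payload that points at a throwaway canary file and check whether the canary's contents appear in the output. The sketch below assumes a hypothetical parseToString(xml) function that wraps whatever parsing stack the application uses; it relies only on Node's standard library.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { pathToFileURL } = require('url');

// Hypothetical check: returns true if `parseToString` resolved an external
// entity that points at a temporary canary file, false otherwise.
function leaksExternalEntity(parseToString) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'xxe-check-'));
  const canaryFile = path.join(dir, 'canary.txt');
  const canary = `canary-${Date.now()}`;
  fs.writeFileSync(canaryFile, canary);

  const payload =
    '<?xml version="1.0"?>' +
    `<!DOCTYPE r [<!ENTITY xxe SYSTEM "${pathToFileURL(canaryFile).href}">]>` +
    '<r>&xxe;</r>';

  try {
    let output;
    try {
      output = String(parseToString(payload));
    } catch {
      return false; // payload rejected outright: nothing leaked
    }
    return output.includes(canary);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}
```

A call such as leaksExternalEntity((xml) => JSON.stringify(myParse(xml))), where myParse is the application's own wrapper, can run in a unit test so a configuration regression fails the build.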

2. Use Secure-By-Default XML Parsing Libraries

Replace xml2json with modern XML libraries that default to secure configurations or provide simplified secure APIs. Consider fast-xml-parser with entity processing explicitly disabled: const parser = new XMLParser({allowBooleanAttributes: true, ignoreAttributes: false, parseTagValue: false, processEntities: false}). For applications that must keep xml2json, create a single wrapper function, for example secureXmlParse(xml), that applies the hardened pre-parse and validation steps before calling xml2json.toJson(). Use the wrapper consistently across the application so that no call site can skip the secure configuration.
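A minimal sketch of such a wrapper for fast-xml-parser follows. The helper name createSafeXmlParser and the constant are hypothetical; the option names are the ones quoted above, and the XMLParser class is passed in so the wrapper can be exercised without the package installed.

```javascript
// Options quoted in the guidance above; processEntities: false is the key one.
const SAFE_FXP_OPTIONS = Object.freeze({
  allowBooleanAttributes: true,
  ignoreAttributes: false,
  parseTagValue: false,
  processEntities: false,
});

// Hypothetical factory: builds one parser instance and returns a function
// that every call site can share.
function createSafeXmlParser(XMLParser) {
  const parser = new XMLParser({ ...SAFE_FXP_OPTIONS });
  return function parseXml(xml) {
    if (typeof xml !== 'string') {
      throw new TypeError('XML input must be a string');
    }
    return parser.parse(xml);
  };
}

// Usage, assuming fast-xml-parser v4 or later is installed:
//   const { XMLParser } = require('fast-xml-parser');
//   const parseXml = createSafeXmlParser(XMLParser);
//   const data = parseXml('<order><id>7</id></order>');
```

Exporting only the returned function from a shared module keeps the options in one place.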

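3. Implement Pre-Parse XML Input Validation and Sanitization

Validate XML input before parsing to detect and reject XXE attack patterns. Check for suspicious patterns: if (xmlString.match(/<!ENTITY|<!DOCTYPE|SYSTEM|PUBLIC/i)) reject. Matching the bare words SYSTEM or PUBLIC also rejects legitimate text that happens to contain them, so anchoring on <!DOCTYPE and <!ENTITY is usually sufficient, since external entities must be declared in a DOCTYPE. Implement size limits (reject XML larger than 1 MB for typical use cases) to reduce the risk of denial of service through entity expansion. Validate that the Content-Type header matches the expected application/xml. Prefer rejecting documents that contain a DOCTYPE over stripping it: xmlString.replace(/<!DOCTYPE[^>]*>/gi, '') stops at the first > and leaves debris behind when the DOCTYPE has an internal subset. Implement XML schema validation (XSD) to enforce the expected structure before further processing. Log rejected XML payloads for security monitoring. Treat these text checks as defense in depth, not as a substitute for hardened parser options.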
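The checks above can be sketched as a single function that runs before any parser is invoked. checkXmlInput and MAX_XML_BYTES are hypothetical names; the 1 MB cap and the application/xml content type come from the guidance above.

```javascript
const MAX_XML_BYTES = 1024 * 1024; // 1 MB cap from the guidance above

// Hypothetical helper: returns { ok, reason } instead of throwing so a route
// handler can log the reason and answer with a 400.
function checkXmlInput(xml, contentType) {
  if (typeof xml !== 'string') {
    return { ok: false, reason: 'not a string' };
  }
  if (!/^application\/xml\s*(;|$)/i.test(contentType || '')) {
    return { ok: false, reason: 'unexpected content type' };
  }
  if (Buffer.byteLength(xml, 'utf8') > MAX_XML_BYTES) {
    return { ok: false, reason: 'too large' };
  }
  // External entities must be declared inside a DOCTYPE, so reject both markers.
  if (/<!DOCTYPE|<!ENTITY/i.test(xml)) {
    return { ok: false, reason: 'DTD content' };
  }
  return { ok: true, reason: null };
}

// Why rejecting beats stripping: the DOCTYPE-stripping regex quoted above
// stops at the first '>' and leaves part of the internal subset behind.
const xxePayload = '<!DOCTYPE r [<!ENTITY x SYSTEM "file:///etc/passwd">]><r>&x;</r>';
console.log(xxePayload.replace(/<!DOCTYPE[^>]*>/gi, '')); // "]><r>&x;</r>"
console.log(checkXmlInput(xxePayload, 'application/xml').reason); // "DTD content"
```

Ordinary text such as <note>operating system</note> passes this check, which a bare SYSTEM match would have rejected.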

4. Implement Strict XML Schema Validation

Define and enforce XML Schema (XSD) validation for all accepted XML formats, for example by parsing with libxmljs.parseXmlString() and then validating the resulting document against a parsed schema. Create schemas that allowlist expected elements, attributes, and structure while rejecting unexpected content. Validate XML against the schema before conversion to JSON. Schema validation runs on an already parsed document, so the parse itself must still use hardened options; the schema then provides defense in depth by rejecting malformed or unexpected XML before it reaches application logic. Maintain schemas in version control alongside application code. For dynamic XML formats, implement strict validation of element names and attributes against predefined sets.
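A minimal sketch of this flow is shown below, assuming a libxmljs build that provides parseXml(), doc.validate(xsdDoc), and doc.validationErrors; check these against the version you have installed. createSchemaValidator is a hypothetical helper, and the library is passed in as an argument so the sketch does not depend on a native module at load time.

```javascript
// Hardened parse options applied to both the schema and the input document.
const SCHEMA_PARSE_OPTIONS = Object.freeze({
  noent: false,
  dtdload: false,
  dtdvalid: false,
  nonet: true,
});

// Hypothetical factory: parses the XSD once and returns a validate(xml) function.
function createSchemaValidator(libxmljs, xsdSource) {
  const schema = libxmljs.parseXml(xsdSource, { ...SCHEMA_PARSE_OPTIONS });
  return function validate(xml) {
    const doc = libxmljs.parseXml(xml, { ...SCHEMA_PARSE_OPTIONS });
    const valid = doc.validate(schema);
    const errors = valid ? [] : doc.validationErrors.map((e) => e.message);
    return { valid, errors, doc };
  };
}

// Usage, assuming libxmljs is installed and order.xsd is kept in the repository:
//   const libxmljs = require('libxmljs');
//   const validateOrder = createSchemaValidator(libxmljs, fs.readFileSync('order.xsd', 'utf8'));
//   const { valid, errors } = validateOrder(xmlString);
```

Only documents for which valid is true would then be converted to JSON.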

5. Migrate to JSON for Data Exchange Where Possible

Evaluate whether XML is truly required for the use case or whether JSON can replace it. Modern APIs prefer JSON for its simplicity and its lack of complex features like entity processing. Migrate XML-based APIs to JSON where feasible: replace Content-Type: application/xml with application/json and convert client integrations to send JSON instead of XML. For legacy XML requirements (SOAP, RSS, SVG), isolate XML processing to specific endpoints with enhanced security monitoring. Document the rationale for any remaining XML usage and the security controls applied. JSON has no entity mechanism, so it eliminates the entire class of XXE vulnerabilities for the endpoints that adopt it.

6. Configure libxmljs with NOENT and Security Flags

When using libxmljs directly as the parsing layer, configure all security-relevant parsing options: libxmljs.parseXmlString(xmlString, {noent: false, dtdload: false, dtdattr: false, dtdvalid: false, nonet: true}). Here noent: false keeps entity substitution off, nonet: true blocks network access for external entity fetching, and the dtd* options disable DTD processing. Keep libxml2's XML_PARSE_NOENT flag unset (noent: false) in all parsing operations. Create a centralized XML parsing utility that applies these options consistently. Monitor parser errors and security events through application logging to detect XXE exploitation attempts.
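A centralized utility along these lines could look like the sketch below. createHardenedParse and the logger shape are hypothetical; the option set is the one listed above, and libxmljs is injected so the module can be loaded and tested without the native dependency.

```javascript
// The option set listed above, frozen so no call site can loosen it.
const SAFE_LIBXML_OPTIONS = Object.freeze({
  noent: false,    // do not substitute entities
  dtdload: false,  // do not load external DTDs
  dtdattr: false,  // do not apply DTD default attributes
  dtdvalid: false, // do not validate against a DTD
  nonet: true,     // forbid network access during parsing
});

// Hypothetical factory: returns the one parse function the application uses.
function createHardenedParse(libxmljs, logger = console) {
  return function parseUntrustedXml(xml) {
    try {
      return libxmljs.parseXmlString(xml, { ...SAFE_LIBXML_OPTIONS });
    } catch (err) {
      // Parser failures on untrusted input are worth monitoring.
      logger.warn('XML rejected by parser', {
        message: err.message,
        bytes: Buffer.byteLength(String(xml)),
      });
      throw err;
    }
  };
}

// Usage, assuming libxmljs is installed:
//   const parseUntrustedXml = createHardenedParse(require('libxmljs'));
//   const doc = parseUntrustedXml(req.body.xml);
```

Passing a copy of the frozen options guards against a library version that mutates the object it receives.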
