Remote code execution from unsafe YAML deserialization in Python

Critical Risk | deserialization

What it is

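Calling yaml.load() on untrusted input with a loader that can construct arbitrary Python objects (yaml.Loader, yaml.UnsafeLoader, or the implicit default in PyYAML versions before 5.1) lets an attacker execute arbitrary Python code in the process that parses the document.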

Why it happens

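PyYAML's unsafe loaders honor Python-specific YAML tags such as !!python/object/apply, which instruct the parser to import modules and instantiate or call arbitrary Python objects while the document is being loaded. When the YAML comes from a user, the attacker chooses those tags and their arguments.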
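The mechanism can be illustrated with a harmless stand-in for an attacker payload. This is a minimal sketch, assuming PyYAML 5.1 or later (where yaml.UnsafeLoader is available); the tag points at builtins.len only to keep the demonstration benign, whereas a real payload would name something like os.system.

```python
import yaml

# Benign stand-in for an attacker payload: the tag names a callable
# (builtins.len) and the sequence supplies its positional arguments.
doc = "!!python/object/apply:builtins.len [[1, 2, 3]]"

# The unsafe loader follows the tag and calls the function while parsing,
# so the "data" returned is whatever the callable returned.
result = yaml.load(doc, Loader=yaml.UnsafeLoader)
print(result)
```

The document never reaches application code as data: the call happens inside yaml.load() itself, which is why validating the result afterwards does not help.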

Root causes

Using yaml.load() on Untrusted Input

Passing user-provided YAML to yaml.load() with an unsafe loader instead of calling yaml.safe_load(), which allows arbitrary Python object instantiation.

Legacy Code Patterns

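Maintaining older code written when a bare yaml.load() call was common. Before PyYAML 5.1, calling yaml.load() without a Loader argument silently used the unsafe loader; PyYAML 5.1 began warning about the missing argument, and PyYAML 6.0 requires it.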

Convenience Over Security

Using the full yaml.load() to deserialize custom Python objects without understanding the remote code execution risk this creates.
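Needing custom objects does not by itself require an unsafe loader. One possible approach, sketched below, is to register a constructor for a single application-defined tag on a SafeLoader subclass. The Point class, the !point tag, and the PointLoader name are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

import yaml


@dataclass
class Point:
    # Hypothetical application type used only for this example.
    x: int
    y: int


class PointLoader(yaml.SafeLoader):
    """SafeLoader subclass, so the registration below stays local to it."""


def construct_point(loader, node):
    # Build the object explicitly from a plain mapping; no arbitrary
    # class lookup or import is driven by the document.
    fields = loader.construct_mapping(node)
    return Point(x=int(fields["x"]), y=int(fields["y"]))


PointLoader.add_constructor("!point", construct_point)

data = yaml.load("origin: !point {x: 0, y: 0}", Loader=PointLoader)
print(data)
```

Only the tags registered this way are accepted; Python-specific tags such as !!python/object/apply are still rejected because the loader inherits SafeLoader's behavior.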

Fixes

1. Use yaml.safe_load()

Replace all yaml.load() calls with yaml.safe_load(), which constructs only standard data types such as dicts, lists, strings, numbers, booleans, and None, and rejects Python-specific tags.
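A short sketch of the replacement, assuming PyYAML is installed. The sample documents are invented for illustration; the second one imitates a typical payload and is expected to be rejected rather than executed.

```python
import yaml

# Plain data loads into ordinary Python containers.
config = yaml.safe_load("name: demo\nports: [80, 443]")
print(config)

# A document carrying a Python-specific tag is refused with an exception
# instead of being executed.
malicious = '!!python/object/apply:os.system ["echo pwned"]'
try:
    yaml.safe_load(malicious)
    rejected = False
except yaml.YAMLError as exc:
    rejected = True
    print(f"rejected: {type(exc).__name__}")
```

Catching yaml.YAMLError covers both malformed documents and rejected tags, so the caller can treat either case as invalid input.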

2. Specify SafeLoader Explicitly

When using yaml.load(), always pass Loader=yaml.SafeLoader to prevent unsafe deserialization.
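The explicit form might look like the sketch below. It also shows one reason to prefer it over safe_load(): the loader class can be swapped for yaml.CSafeLoader, which exists only when PyYAML was built with the libyaml bindings. The load_untrusted helper name is hypothetical.

```python
import yaml

# CSafeLoader is present only if PyYAML was compiled against libyaml;
# fall back to the pure-Python SafeLoader otherwise.
SafeLoaderClass = getattr(yaml, "CSafeLoader", yaml.SafeLoader)


def load_untrusted(text):
    """Parse YAML from an untrusted source using a safe loader only."""
    return yaml.load(text, Loader=SafeLoaderClass)


settings = load_untrusted("debug: false\nretries: 3")
print(settings)
```

Routing every call through one helper like this also makes it easier to audit a codebase for stray yaml.load() calls that use a different loader.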

3. Validate YAML Schema

Parse YAML with safe_load() and validate structure against a schema before processing.
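As a rough illustration of this step, the sketch below checks a parsed document against a small hand-written schema. The EXPECTED_FIELDS layout (a name string and a port integer) and the load_config helper are assumptions made for the example; a dedicated schema library could fill the same role.

```python
import yaml

# Hypothetical schema for illustration: required key -> expected type.
EXPECTED_FIELDS = {"name": str, "port": int}


def load_config(text):
    """Safely parse YAML, then reject anything that does not match the schema."""
    data = yaml.safe_load(text)
    if not isinstance(data, dict):
        raise ValueError("top-level YAML value must be a mapping")

    unknown = set(data) - set(EXPECTED_FIELDS)
    if unknown:
        raise ValueError(f"unexpected keys: {sorted(map(str, unknown))}")

    for key, expected_type in EXPECTED_FIELDS.items():
        if key not in data:
            raise ValueError(f"missing key: {key}")
        value = data[key]
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"{key} must be of type {expected_type.__name__}")
    return data


config = load_config("name: api\nport: 8080")
print(config)
```

Schema validation complements safe loading rather than replacing it: safe_load() prevents code execution during parsing, while the schema check keeps unexpected shapes and types out of the rest of the application.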

Detect This Vulnerability in Your Code

Sourcery automatically identifies remote code execution from unsafe YAML deserialization in Python, along with many other security issues in your codebase.