Injection from user-controlled format string in printf-family functions

Critical Risk memory-safety
c, cpp, format-string, printf, memory-corruption, injection, rce, arbitrary-read, arbitrary-write

What it is

Format string vulnerabilities occur when user-controlled input is passed as the format string to a printf-family function. Because the function interprets every conversion specifier in that string, an attacker can read stack and other memory with %x and %s, write to memory with %n, and potentially escalate to remote code execution.

/* VULNERABLE: Format string vulnerabilities */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>

// VULNERABLE: Direct user input as format string
void vulnerable_log_message(const char* user_message) {
    // VULNERABLE: user_message used directly as format string
    printf(user_message);
    // Attacker can use: "%x %x %x %x" to read stack memory
    // Or "%n" to write to memory locations
}

// VULNERABLE: Error reporting function
void vulnerable_error(const char* error_msg) {
    // VULNERABLE: User-controlled format string
    fprintf(stderr, error_msg);
    fprintf(stderr, "\n");
}

// VULNERABLE: sprintf without bounds checking
void vulnerable_format_message(char* buffer, const char* format, const char* user_data) {
    // VULNERABLE: Unbounded write and potential format string injection
    sprintf(buffer, format, user_data);
    // If format comes from user input, this is vulnerable
    // Also no bounds checking on buffer
}

/* SECURE: Safe format string usage */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdarg.h>
#include <ctype.h>

// SECURE: Constant format string with user data as argument
void secure_log_message(const char* user_message) {
    // SECURE: Constant format string, user data as argument
    printf("%s", user_message);
    // Even if user_message contains format specifiers, they won't be interpreted
}

// SECURE: Error reporting with safe format
void secure_error(const char* error_msg) {
    // SECURE: Constant format string
    fprintf(stderr, "Error: %s\n", error_msg);
}

// SECURE: Bounded string formatting
int secure_format_message(char* buffer, size_t buffer_size, const char* format, const char* user_data) {
    // SECURE: Use snprintf with bounds checking
    // format must be a trusted constant format string containing a single %s
    int result = snprintf(buffer, buffer_size, format, user_data);
    
    // Check for an encoding error (negative result) or truncation
    if (result < 0 || (size_t)result >= buffer_size) {
        fprintf(stderr, "Warning: Message could not be formatted or was truncated\n");
        return -1;
    }
    
    return result;
}

// SECURE: Input sanitization for display (defense in depth, not a substitute for a constant format string)
void secure_sanitize_for_display(const char* input, char* output, size_t output_size) {
    if (!input || !output || output_size == 0) return;
    
    size_t input_len = strlen(input);
    size_t max_copy = output_size - 1;  // Leave space for null terminator
    size_t copy_len = (input_len < max_copy) ? input_len : max_copy;
    
    for (size_t i = 0; i < copy_len; i++) {
        char c = input[i];
        
        // Replace potentially dangerous characters
        if (c == '%') {
            output[i] = '#';  // Replace % with #
        } else if (c == '\n' || c == '\r') {
            output[i] = ' ';  // Replace newlines with spaces
        } else if (!isprint((unsigned char)c)) {  // Cast avoids undefined behavior for negative char values
            output[i] = '?';  // Replace non-printable with ?
        } else {
            output[i] = c;
        }
    }
    
    output[copy_len] = '\0';
}

Why This Fix Works

The vulnerable examples pass user-controlled data as the format string, which enables memory disclosure and arbitrary memory writes. The secure alternatives use constant format strings so that user data is only ever an argument, use snprintf with a checked return value to prevent buffer overflows, and sanitize input for display as an additional layer of defense.
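To make the effect of a constant format string concrete, here is a minimal sketch. The helper name `render_untrusted` is hypothetical and not part of the examples above. It formats into a buffer rather than printing so the result can be inspected.

```c
#include <stdio.h>

/* Hypothetical helper: renders untrusted text through a constant "%s" format.
   The format is a string literal, so `untrusted` is only ever treated as data. */
int render_untrusted(char *out, size_t out_size, const char *untrusted) {
    return snprintf(out, out_size, "%s", untrusted);
}
```

Calling `render_untrusted(buf, sizeof buf, "%x %n %s")` should leave the literal text `%x %n %s` in `buf`, because the specifiers inside the argument are never interpreted.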

Why it happens

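printf-family functions treat their format argument as instructions: every conversion specifier in it tells the function to fetch and interpret another argument. When untrusted data is passed as that argument instead of a constant format string, the attacker chooses which specifiers are executed.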

Root causes

Direct User Input as Format String

User input or untrusted data is passed directly as the first argument to printf, fprintf, sprintf, or similar functions without using a constant format string.

Unsafe Logging Functions

Custom logging or error reporting functions that accept user data and pass it directly to printf-family functions without proper format string validation.
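One way to harden a custom logging wrapper is to let the compiler check its callers. The sketch below assumes GCC or Clang, whose `format` function attribute marks a parameter as a printf-style format string. The names `log_format` and `PRINTF_LIKE` are hypothetical. With `-Wformat -Wformat-security` enabled, a call that passes a non-literal string as the format with no further arguments should produce a warning.

```c
#include <stdarg.h>
#include <stdio.h>

/* GCC/Clang extension (an assumption about the toolchain): the format string
   is parameter fmt_idx and the variadic arguments start at args_idx.
   On other compilers the macro expands to nothing. */
#if defined(__GNUC__) || defined(__clang__)
#define PRINTF_LIKE(fmt_idx, args_idx) \
    __attribute__((format(printf, fmt_idx, args_idx)))
#else
#define PRINTF_LIKE(fmt_idx, args_idx)
#endif

/* Hypothetical logger that formats into a caller-supplied buffer. */
int log_format(char *out, size_t out_size, const char *fmt, ...) PRINTF_LIKE(3, 4);

int log_format(char *out, size_t out_size, const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    /* vsnprintf is the bounded counterpart of vsprintf. */
    int written = vsnprintf(out, out_size, fmt, args);
    va_end(args);
    return written;
}
```

Callers would then write `log_format(buf, sizeof buf, "user=%s id=%d", name, id)` and keep user data out of the `fmt` position entirely.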

Unbounded sprintf Usage

Using sprintf or vsprintf without buffer size limits, combined with user-controlled format strings, can lead to both format string attacks and buffer overflows.
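The bounded alternative can be sketched as follows. `format_greeting` is a hypothetical helper used only for illustration. It relies on the standard behavior of snprintf, which returns the number of characters the full output would need (excluding the terminator), or a negative value on an encoding error.

```c
#include <stdio.h>

/* Hypothetical helper: returns 0 on success, -1 on error or truncation. */
int format_greeting(char *out, size_t out_size, const char *name) {
    /* Constant format string; `name` is passed only as data. */
    int needed = snprintf(out, out_size, "Hello, %s!", name);
    if (needed < 0) {
        return -1;  /* encoding error */
    }
    if ((size_t)needed >= out_size) {
        return -1;  /* output was truncated to fit out_size */
    }
    return 0;
}
```

With an 8-byte buffer and the name `Alice`, the call reports truncation and the buffer holds only `Hello, ` plus the terminator, rather than writing past the end as sprintf would.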

Fixes

1. Use Constant Format Strings

Always use a constant format string as the first argument to printf-family functions, and pass user data as subsequent arguments (e.g., printf("%s", user_input) instead of printf(user_input)).

2. Replace Unsafe Functions

Replace sprintf and vsprintf with their safer counterparts snprintf and vsnprintf that include buffer size limits to prevent overflows.

3. Input Validation and Sanitization

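Treat validation and sanitization as defense in depth rather than the primary fix. Where user input must end up inside a string that is later interpreted as a format, remove or escape its format specifiers first, for example by doubling each % to %%.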
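Escaping can be sketched as below. `escape_percent` is a hypothetical helper, not a standard library function. It doubles every `%` so that a later printf-style interpretation prints a literal percent sign, and it fails closed when the destination is too small.

```c
#include <stddef.h>

/* Hypothetical helper: copies src to dst, doubling every '%'.
   Returns 0 on success, -1 on invalid arguments or if dst is too small. */
int escape_percent(const char *src, char *dst, size_t dst_size) {
    if (!src || !dst || dst_size == 0) {
        return -1;
    }
    size_t j = 0;
    for (size_t i = 0; src[i] != '\0'; i++) {
        size_t need = (src[i] == '%') ? 2 : 1;
        if (j + need >= dst_size) {  /* keep one byte for the terminator */
            dst[0] = '\0';           /* fail closed with an empty string */
            return -1;
        }
        if (src[i] == '%') {
            dst[j++] = '%';          /* extra '%' turns "%" into "%%" */
        }
        dst[j++] = src[i];
    }
    dst[j] = '\0';
    return 0;
}
```

For the input `100%n` the output is `100%%n`. This does not replace the constant-format-string rule; it only covers cases where untrusted text has to be embedded in a format that is assembled elsewhere.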
