# Improper Control of Generation of Code ('Code Injection') (CWE-94)

The product constructs all or part of a code segment using externally-influenced input from an upstream component, but it does not neutralize or incorrectly neutralizes special elements that could modify the syntax or behavior of the intended code segment.

- Prevalence: High (frequently exploited)
- Impact: Critical (6 critical-severity rules)
- Prevention: Documented (10 fix examples)

**OWASP:** A03:2021 - Injection (#3)

## Description

When software allows a user's input to contain code syntax, it might be possible for an attacker to craft the code in such a way that it will alter the intended control flow of the software. Such an alteration could lead to arbitrary code execution.
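To make this concrete, here is a minimal sketch (a hypothetical calculator endpoint, not drawn from the rules below): the unsafe version evaluates the request as Python, so input that contains code syntax runs as code, while the safe version treats the input as data and only dispatches to predefined operations.

```python
import ast

# Vulnerable: the input may contain arbitrary Python, e.g.
# "__import__('os').system('id')", which eval() will execute.
def calculate_unsafe(user_expr: str):
    return eval(user_expr)

# Safer: only a predefined set of operations is reachable, and the
# operands are parsed as literals rather than evaluated as code.
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def calculate_safe(op_name: str, left: str, right: str):
    if op_name not in OPS:
        raise ValueError("unsupported operation")
    return OPS[op_name](ast.literal_eval(left), ast.literal_eval(right))
```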
## Prevention

Prevention strategies for Code Injection based on 10 Shoulder detection rules.

### Key Practices

LLM outputs should be treated as untrusted input since:

- Prompt injection attacks can manipulate AI responses
- LLMs can hallucinate and produce unexpected outputs
- Model behavior may change between versions

Dangerous operations include code execution (eval, exec, compile, Function, vm.runInContext) and command execution (os.system, subprocess, exec, spawn); these should be avoided or heavily restricted. A minimal validation sketch follows the per-language practices below.

### Go

- Pass user input as template data; never use template.HTML with unsanitized input
- Validate and sanitize LLM outputs before using them in dangerous operations like exec or SQL
- Use predefined templates and pass user input as template data, never as template code

### Node.js

- Replace eval/Function constructor with safe alternatives like JSON.parse or predefined function maps
- Use static values for decorator parameters and avoid eval(), global modifications, or user input in decorators

### Python

- Use ast.literal_eval() for safe evaluation, or avoid eval/exec entirely
- Replace eval/exec with ast.literal_eval, JSON parsing, or subprocess with shell=False
- Validate and sanitize LLM outputs with Pydantic before using them in dangerous operations like eval, exec, or SQL
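As a minimal sketch of treating LLM output as untrusted input before a dangerous operation (the helper name and the allowlist are illustrative assumptions, not part of any rule): validate the model's suggestion against an allowlist, then execute it without a shell.

```python
import subprocess

# Hypothetical allowlist: the only commands an LLM-proposed string may trigger.
ALLOWED_COMMANDS = {"uptime", "df", "free"}

def run_llm_suggested_command(llm_output: str) -> str:
    """Treat the model's output as untrusted: validate before executing."""
    command = llm_output.strip()
    if command not in ALLOWED_COMMANDS:
        raise ValueError(f"Command not permitted: {command!r}")
    # Argument list with shell=False, so no shell metacharacters are interpreted.
    result = subprocess.run([command], shell=False, capture_output=True,
                            text=True, timeout=10)
    return result.stdout
```

An allowlist is the simplest option when the set of acceptable outputs is small; the Pydantic rule further down shows schema-based validation for structured outputs.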
## Warning Signs

- [HIGH] LLM output flows to ... without validation
- [HIGH] LLM outputs used directly in dangerous operations like command execution or SQL queries without validation
- [HIGH] LLM/AI outputs being used directly in dangerous operations without proper validation or sanitization
- [HIGH] Decorator '...' executes unsafe code or accesses global state. This can lead to code injection or unauthorized access.
- [CRITICAL] user input flowing to template functions that bypass HTML escaping
- [CRITICAL] user input flowing to code execution functions like eval() or Function constructor
- [CRITICAL] untrusted user input flowing into code evaluation functions (eval, exec, compile)
- [CRITICAL] usage of dangerous Python functions that can lead to arbitrary code execution: eval(), exec(), compile()

## Consequences

- Execute Unauthorized Code
- Read Application Data
- Modify Application Data

## Mitigations

- Refactor code to avoid use of eval() or equivalent functions
- Run code in a sandbox that enforces strict boundaries
- Use static type checking where possible

## Detection

- Total rules: 10
- Critical: 6
- Languages: go, javascript, typescript, python

## Rules by Language

### Python (4 rules)

- **Code Injection via eval/exec** [CRITICAL]: Detects untrusted user input flowing into code evaluation functions (eval, exec, compile).
  - Remediation: Use ast.literal_eval() for safe evaluation of literals.

    ```python
    import ast

    parsed = ast.literal_eval(user_input)
    ```

    Learn more: https://shoulder.dev/learn/python/cwe-94/code-injection

- **Dangerous Function Usage** [CRITICAL]: Detects usage of dangerous Python functions that can lead to arbitrary code execution: eval(), exec(), compile(), __import__() with user input, or pickle deserialization. These should be avoided or heavily restricted.
  - Remediation: Use ast.literal_eval() for safe literal evaluation, JSON for serialization, and subprocess with shell=False.

    ```python
    import ast
    import json
    import subprocess

    # Safe literal evaluation (numbers, strings, lists, dicts only)
    result = ast.literal_eval(user_input)

    # Safe serialization (use JSON instead of pickle)
    data = json.loads(user_input)

    # Safe subprocess (use argument list, not shell)
    subprocess.run(['ping', '-c', '1', host], shell=False, timeout=30)
    ```

    Learn more: https://shoulder.dev/learn/python/cwe-94/dangerous-functions

- **LLM Insecure Output Handling** [HIGH]: Detects LLM/AI outputs being used directly in dangerous operations without proper validation or sanitization. OWASP LLM02 - Insecure Output Handling. LLM outputs should be treated as untrusted input since:
  - Prompt injection attacks can manipulate AI responses
  - LLMs can hallucinate and produce unexpected outputs
  - Model behavior may change between versions

  Dangerous operations include:
  - Code execution (eval, exec, compile)
  - Command execution (os.system, subprocess)
  - SQL queries (cursor.execute, raw queries)
  - Template rendering (Jinja2, Django templates)
  - File operations (open, write, unlink)
  - Deserialization (pickle, yaml.load)

  - Remediation: Validate LLM outputs with Pydantic before using in sensitive operations.

    ```python
    from pydantic import BaseModel, validator
    import re

    class SearchResponse(BaseModel):
        terms: list[str]

        @validator('terms', each_item=True)
        def validate_term(cls, v):
            if not re.match(r'^[a-zA-Z0-9\s]+$', v):
                raise ValueError('Invalid search term')
            return v

    validated = SearchResponse.parse_raw(response.choices[0].message.content)
    ```

    Learn more: https://shoulder.dev/learn/python/cwe-94/llm-insecure-output-handling

- **Server-Side Template Injection (SSTI)** [CRITICAL]: Detects user input used directly in template rendering, allowing arbitrary code execution.
  - Remediation: Use template files with render_template(), not render_template_string() (an expanded before/after sketch follows this list).

    ```python
    return render_template('page.html', name=user_name)
    ```

    Learn more: https://shoulder.dev/learn/python/cwe-94/ssti
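To expand the SSTI remediation above into a before/after comparison, here is a minimal sketch assuming a hypothetical Flask route (the handler names and template file are illustrative): the vulnerable version builds template source from user input, while the safe version passes it as template data.

```python
from flask import Flask, request, render_template, render_template_string

app = Flask(__name__)

@app.route("/greet-unsafe")
def greet_unsafe():
    # Vulnerable: the user controls the template source, so payloads such as
    # {{ config }} or other Jinja2 expressions are evaluated server-side.
    name = request.args.get("name", "")
    return render_template_string("<h1>Hello " + name + "</h1>")

@app.route("/greet")
def greet():
    # Safe: the template is a fixed file and the input is only data,
    # which Jinja2 HTML-escapes by default.
    name = request.args.get("name", "")
    return render_template("page.html", name=name)
```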
### Go (3 rules)

- **Code Injection via os/exec** [CRITICAL]: Detects user input flowing to template functions that bypass HTML escaping.
  - Remediation: Pass user input as template data instead of using template.HTML.

    ```go
    data := struct{ Content string }{Content: userInput}
    tmpl.Execute(w, data)
    ```

    Learn more: https://shoulder.dev/learn/go/cwe-94/code-injection

- **LLM Insecure Output Handling** [HIGH]: Detects LLM outputs used directly in dangerous operations like command execution or SQL queries without validation.
  - Remediation: Validate LLM outputs against an allowlist before using in dangerous operations.

    ```go
    if !validCommands[output] {
        return errors.New("invalid command")
    }
    ```

    Learn more: https://shoulder.dev/learn/go/cwe-94/llm-insecure-output-handling

- **Server-Side Template Injection** [CRITICAL]: User input passed directly to template.Parse without sanitization.
  - Remediation: Use predefined templates and pass user data as template variables.

    ```go
    tmpl := template.Must(template.ParseFiles("page.html"))
    tmpl.Execute(w, map[string]string{
        "name": userInput, // Safe - passed as data, not template code
    })
    ```

    Learn more: https://shoulder.dev/learn/go/cwe-94/ssti

### TypeScript (3 rules)

- **Code Injection via eval() and Function constructor** [CRITICAL]: Detects user input flowing to code execution functions like eval() or Function constructor.
  - Remediation: Use JSON.parse for data or predefined function maps instead of eval().

    ```javascript
    const data = JSON.parse(userInput);
    // Or use a function map
    const ops = { add: (a, b) => a + b };
    ops[action]?.(x, y);
    ```

    Learn more: https://shoulder.dev/learn/javascript/cwe-94/code-injection

- **LLM Insecure Output Handling** [HIGH]: Detects LLM/AI outputs being used directly in dangerous operations without proper validation or sanitization. OWASP LLM02 - Insecure Output Handling. LLM outputs should be treated as untrusted input since:
  - Prompt injection attacks can manipulate AI responses
  - LLMs can hallucinate and produce unexpected outputs
  - Model behavior may change between versions

  Dangerous operations include:
  - Code execution (eval, Function, vm.runInContext)
  - Command execution (exec, spawn, execSync)
  - SQL queries (database operations)
  - HTML rendering (innerHTML, document.write)
  - File operations (writeFile, unlink)
  - Network requests (fetch, axios with LLM-generated URLs)

  - Remediation: Validate LLM outputs against expected formats before using in dangerous operations.

    ```javascript
    const content = response.choices[0].message.content;
    if (!/^[a-zA-Z0-9\s]+$/.test(content)) {
      throw new Error('Invalid format');
    }
    ```

    Learn more: https://shoulder.dev/learn/javascript/cwe-94/llm-insecure-output-handling

- **TypeScript Unsafe Decorator Usage** [HIGH]: Decorators that use eval(), modify global state, or accept user input as parameters enable code injection, prototype pollution, and authorization bypass.
  - Remediation: Use static values for decorator parameters and avoid eval/global modifications.

    ```typescript
    enum Role { Admin = 'admin', User = 'user' }

    function RequireRole(...roles: Role[]) {
      return function (target: any, key: string, desc: PropertyDescriptor) {
        const original = desc.value;
        desc.value = function (...args: any[]) {
          if (!roles.includes(this.user?.role)) {
            throw new Error('Unauthorized');
          }
          return original.apply(this, args);
        };
      };
    }
    ```

    Learn more: https://shoulder.dev/learn/typescript/cwe-94/unsafe-decorator

### JavaScript (2 rules)

- **Code Injection via eval() and Function constructor** [CRITICAL]: Detects user input flowing to code execution functions like eval() or Function constructor.
  - Remediation: Use JSON.parse for data or predefined function maps instead of eval().

    ```javascript
    const data = JSON.parse(userInput);
    // Or use a function map
    const ops = { add: (a, b) => a + b };
    ops[action]?.(x, y);
    ```

    Learn more: https://shoulder.dev/learn/javascript/cwe-94/code-injection

- **LLM Insecure Output Handling** [HIGH]: Detects LLM/AI outputs being used directly in dangerous operations without proper validation or sanitization. OWASP LLM02 - Insecure Output Handling. LLM outputs should be treated as untrusted input since:
  - Prompt injection attacks can manipulate AI responses
  - LLMs can hallucinate and produce unexpected outputs
  - Model behavior may change between versions

  Dangerous operations include:
  - Code execution (eval, Function, vm.runInContext)
  - Command execution (exec, spawn, execSync)
  - SQL queries (database operations)
  - HTML rendering (innerHTML, document.write)
  - File operations (writeFile, unlink)
  - Network requests (fetch, axios with LLM-generated URLs)

  - Remediation: Validate LLM outputs against expected formats before using in dangerous operations.

    ```javascript
    const content = response.choices[0].message.content;
    if (!/^[a-zA-Z0-9\s]+$/.test(content)) {
      throw new Error('Invalid format');
    }
    ```

    Learn more: https://shoulder.dev/learn/javascript/cwe-94/llm-insecure-output-handling