# Deserialization of Untrusted Data (CWE-502)

The product deserializes untrusted data without sufficiently verifying that the resulting data will be valid.

**Stack:** JavaScript

- Prevalence: Medium (3 languages covered)
- Impact: Critical (3 critical-severity rules)
- Prevention: Documented (7 fix examples)

**OWASP:** Software and Data Integrity Failures (A08:2021-Software and Data Integrity Failures) - #8

## Description

Many programming languages allow the serialization of objects for storage or transmission. When untrusted data is deserialized, it can lead to code execution, denial of service, or other unintended consequences.

## Prevention

Prevention strategies for Deserialization of Untrusted Data, based on 2 Shoulder detection rules.

### JavaScript

- Validate training data against schemas and use content moderation before fine-tuning
- Use `JSON.parse()` instead of `node-serialize`, and `yaml.SAFE_SCHEMA` for YAML parsing

## Warning Signs

- [HIGH] untrusted or unvalidated data flowing into AI/LLM fine-tuning or training processes
- [CRITICAL] user input flowing to unsafe deserialization functions like node-serialize or yaml

## Consequences

- Execute unauthorized code
- DoS: crash / exit / restart
- Modify application data

## Mitigations

- Avoid deserializing untrusted data where possible
- If deserialization is necessary, use safe formats such as JSON
- Apply integrity checks such as digital signatures
- Isolate deserialization in a low-privilege environment

## Detection

- Total rules: 7
- Critical: 3
- Languages: go, javascript, typescript, python

## Rules by Language

### JavaScript (2 rules)

- **LLM Training Data Poisoning** [HIGH]: Detects untrusted or unvalidated data flowing into AI/LLM fine-tuning or training processes. OWASP LLM03 - Training Data Poisoning.
  Training data poisoning can:
  - Introduce backdoors into model behavior
  - Bias model outputs maliciously
  - Embed harmful content that appears in responses
  - Compromise model accuracy and reliability
  - Create security vulnerabilities in model behavior

  This rule detects:
  - User-provided data used directly in fine-tuning
  - External data sources used without validation

  - Remediation: Validate training data against schemas and use content moderation before fine-tuning.

  ```javascript
  // Reject the request before the data reaches the fine-tuning API
  if (!validate(trainingData)) {
    return res.status(400).json({ error: 'Invalid format' });
  }
  await openai.files.create({ file: trainingData, purpose: 'fine-tune' });
  ```

  Learn more: https://shoulder.dev/learn/javascript/cwe-502/llm-training-data-poisoning

- **Unsafe Deserialization** [CRITICAL]: Detects user input flowing to unsafe deserialization functions like node-serialize or yaml.load().

  - Remediation: Use JSON.parse() instead of node-serialize, or use yaml.SAFE_SCHEMA for YAML parsing.

  ```javascript
  const data = JSON.parse(userInput);
  // Or for YAML:
  const config = yaml.load(input, { schema: yaml.SAFE_SCHEMA });
  ```

  Learn more: https://shoulder.dev/learn/javascript/cwe-502/unsafe-deserialization

### TypeScript (2 rules)

- **LLM Training Data Poisoning** [HIGH]: Detects untrusted or unvalidated data flowing into AI/LLM fine-tuning or training processes. OWASP LLM03 - Training Data Poisoning.

  Training data poisoning can:
  - Introduce backdoors into model behavior
  - Bias model outputs maliciously
  - Embed harmful content that appears in responses
  - Compromise model accuracy and reliability
  - Create security vulnerabilities in model behavior

  This rule detects:
  - User-provided data used directly in fine-tuning
  - External data sources used without validation

  - Remediation: Validate training data against schemas and use content moderation before fine-tuning.
  ```javascript
  if (!validate(trainingData)) {
    return res.status(400).json({ error: 'Invalid format' });
  }
  await openai.files.create({ file: trainingData, purpose: 'fine-tune' });
  ```

  Learn more: https://shoulder.dev/learn/javascript/cwe-502/llm-training-data-poisoning

- **Unsafe Deserialization** [CRITICAL]: Detects user input flowing to unsafe deserialization functions like node-serialize or yaml.load().

  - Remediation: Use JSON.parse() instead of node-serialize, or use yaml.SAFE_SCHEMA for YAML parsing.

  ```javascript
  const data = JSON.parse(userInput);
  // Or for YAML:
  const config = yaml.load(input, { schema: yaml.SAFE_SCHEMA });
  ```

  Learn more: https://shoulder.dev/learn/javascript/cwe-502/unsafe-deserialization
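
The mitigations listed earlier (prefer a data-only format such as JSON, and apply an integrity check before deserializing) can be combined in a few lines. The following is a minimal sketch for Node.js, not a definitive implementation: the helper names `sign` and `safeDeserialize`, the HMAC-SHA256 choice, and the `username` field are all assumptions made for illustration and do not come from the Shoulder rules themselves.

```javascript
const crypto = require('node:crypto');

// Hypothetical helper: HMAC-SHA256 over the serialized payload, hex-encoded.
function sign(payload, secret) {
  return crypto.createHmac('sha256', secret).update(payload).digest('hex');
}

// Hypothetical helper: verify integrity first, then parse, then check the shape.
function safeDeserialize(payload, signatureHex, secret) {
  const expected = Buffer.from(sign(payload, secret), 'hex');
  const received = Buffer.from(String(signatureHex), 'hex');

  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  if (received.length !== expected.length ||
      !crypto.timingSafeEqual(received, expected)) {
    throw new Error('Signature mismatch');
  }

  // JSON.parse builds plain data only; it does not evaluate functions
  // embedded in the input the way node-serialize can.
  const data = JSON.parse(payload);

  // Minimal shape check (the expected fields are an assumption of this sketch).
  if (typeof data !== 'object' || data === null || Array.isArray(data)) {
    throw new Error('Expected a JSON object');
  }
  if (typeof data.username !== 'string') {
    throw new Error('username must be a string');
  }

  // Copy only the known fields so unexpected keys are dropped.
  return { username: data.username };
}
```

A caller would produce the signature with `sign(payload, secret)` on the trusted side and pass both values to `safeDeserialize` on the receiving side. Verifying before parsing means a tampered payload is rejected without ever reaching the parser, and returning a fresh object keeps unreviewed keys out of application state.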