# Improper Handling of Unicode Encoding (CWE-176) The product does not properly handle when an input contains Unicode encoding. **Stack:** JavaScript - Prevalence: Média 3 linguagens cobertas - Impact: Médio Revisão recomendada - Prevention: Documentada 3 exemplos de correção **OWASP:** Injection (A03:2021-Injection) - #3 ## Description Unicode characters can have multiple encodings or representations. If an application does not properly handle Unicode, attackers may be able to bypass security filters or cause unexpected behavior using alternate encodings. ## Prevention Estratégias de prevenção para Improper Handling of Unicode baseadas em 1 regras de detecção do Shoulder. ### JavaScript Normalize Unicode strings with NFKC before security-sensitive comparisons ## Warning Signs - [MEDIUM] missing Unicode normalization in security-sensitive string comparisons ## Consequences - Burlar mecanismo de proteção - Executar código não autorizado ## Mitigations - Normalize entradas Unicode para uma forma canônica antes de processar - Aplique verificações de segurança após a normalização Unicode - Use funções de comparação que entendam Unicode ## Detection - Total rules: 3 - Languages: go, javascript, typescript, python ## Rules by Language ### Javascript (1 rules) - **Unicode Normalization Security Issues** [MEDIUM]: Detects missing Unicode normalization in security-sensitive string comparisons. Unicode allows multiple representations of visually identical characters, which attackers can exploit to bypass input validation, authentication, or access control. Common attack vectors: - Homograph attacks (using lookalike characters): "аdmin" vs "admin" (Cyrillic 'а') - Case folding differences: "ß" (German sharp s) becomes "SS" when uppercased - Combining characters: "é" can be a single char or 'e' + combining a - Remediation: Normalize Unicode strings with NFKC before security-sensitive comparisons: ```javascript app.post('/login', (req, res) => { const username = req.body.username.normalize('NFKC').toLowerCase(); if (username === 'admin') { return res.send('Admin access'); } res.send('User access'); }); ``` Learn more: https://shoulder.dev/learn/javascript/cwe-176/unicode-normalization ### Typescript (1 rules) - **Unicode Normalization Security Issues** [MEDIUM]: Detects missing Unicode normalization in security-sensitive string comparisons. Unicode allows multiple representations of visually identical characters, which attackers can exploit to bypass input validation, authentication, or access control. Common attack vectors: - Homograph attacks (using lookalike characters): "аdmin" vs "admin" (Cyrillic 'а') - Case folding differences: "ß" (German sharp s) becomes "SS" when uppercased - Combining characters: "é" can be a single char or 'e' + combining a - Remediation: Normalize Unicode strings with NFKC before security-sensitive comparisons: ```javascript app.post('/login', (req, res) => { const username = req.body.username.normalize('NFKC').toLowerCase(); if (username === 'admin') { return res.send('Admin access'); } res.send('User access'); }); ``` Learn more: https://shoulder.dev/learn/javascript/cwe-176/unicode-normalization