[Novalug] "taint"

Peter Larsen peter@peterlarsen.org
Tue Mar 24 21:39:37 EDT 2015


On 03/24/2015 07:02 AM, Rich Kulawiec via Novalug wrote:
> On Mon, Mar 23, 2015 at 06:45:44PM -0400, Theodore Ruegsegger via Novalug wrote:
>> Turns out it's relatively easy to fix using "prepared statements" or
>> whatever your favorite DBMS uses to segregate data from code.
> There's another highly effective -- but sadly underused -- method for
> defending backend DBs and other services: a sanitizing HTTP proxy.

Since you seem to be talking only about URLs, you're not covering the
most common case: POST forms. Most data submissions aren't made through
the URI but through HTTP parameters in the request body.

>
> To explain: presume a static web site for a moment, so that it's possible
> to enumerate every valid URL with something like:
>
> 	find /var/www/site -type f -a -name "*.html" -print
>
> It's easy to transform that file list into a URL list like:
>
> 	http://www.example.com/index.html
> 	http://www.example.com/fred.html
> 	http://www.example.com/wilma.html
> 	http://www.example.com/foo/barney.html
> 	http://www.example.com/bar/betty.html
>
> That list can be handed to an HTTP proxy that sits in front of the real
> web server and (a) denies any request for an invalid URL and (b) optionally,
> gives a time-out to any IP address that keeps asking for invalid URLs.
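
If I follow you, the check you're describing amounts to roughly this
sketch (Python; the file name and the "penalty box" details are my own
assumptions, not anything from your setup):

    import time

    # Assumption: the enumerated URLs live in a file, one per line.
    with open("valid-urls.txt") as f:
        ALLOWED = set(line.strip() for line in f)

    misses = {}          # client IP -> count of invalid requests seen
    blocked_until = {}   # client IP -> time when the penalty expires

    def allow(client_ip, url, max_misses=5, penalty=600):
        """Return True if the proxy should forward the request upstream."""
        now = time.time()
        if blocked_until.get(client_ip, 0) > now:
            return False                  # (b) client is still timed out
        if url in ALLOWED:
            return True                   # (a) known-good URL, pass it on
        misses[client_ip] = misses.get(client_ip, 0) + 1
        if misses[client_ip] >= max_misses:
            blocked_until[client_ip] = now + penalty
        return False                      # (a) invalid URL, reject it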

I'm having trouble seeing the point of this. What's the difference
between what your proxy does and what the web server is already doing?

> This means that the real web server will never see a request that it
> can't answer.

All you've done is push what the web server is already doing onto a
different server. So when something changes, you now have to change both
the proxy and the real server, and hope you cover every function
properly on both sides? I'm not sure I see the safety gain here.

>
> Now presume a dynamic web site.  It's not possible to enumerate every
> valid URL, however it's possible to enumerate the form of every valid
> URL via regular expressions, e.g., if:
>
> 	http://www.example.com/lookup?fred

Again, this assumes HTTP GET with query strings - which most form
submissions don't use. Instead POST is used, which doesn't pass the
variables the user types in through the URL. You'll certainly never see
long strings up there: in practice the URI is limited to a few thousand
characters, and the encoding required for complex data is just nuts -
it's much easier to use POST.

Of course, modern sites often don't even post forms anymore; they use
JavaScript to build dynamic pages and make direct REST calls (again via
POST) to pass data back as the user types.
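
To illustrate (a minimal sketch using Python's standard library; the
host and path are just the example.com placeholders from above):

    from urllib.parse import urlencode
    from urllib.request import Request

    # Form data in a POST travels in the request body, not in the URL,
    # so a URL-pattern filter never sees it.
    body = urlencode({"name": "fred",
                      "comment": "anything the user typed"}).encode()
    req = Request("http://www.example.com/lookup", data=body, method="POST")

    print(req.get_method(), req.full_url)  # POST http://www.example.com/lookup
    print(req.data)     # b'name=fred&comment=anything+the+user+typed'
    # urlopen(req) would actually send it; omitted here since example.com
    # won't answer.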

> is valid but:
>
> 	http://www.example.com/lookup?fred&barney
>
> is not then:
>
> 	http://www.example.com/lookup\?[a-z]+$
>
> stipulates that the only valid character string after a ? is one
> consisting of one or more lowercase alphabetic characters. Thus
> funny stuff like:
>
> 	http://www.example.com/lookup?fred?blah
>
> will be rejected before it even gets near the real web server.
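
So in effect the proxy would be doing something like this (a rough
Python sketch of the idea, reusing the pattern from your example):

    import re

    # One pattern here; a real deployment would carry one per valid URL form.
    VALID = [re.compile(r"^http://www\.example\.com/lookup\?[a-z]+$")]

    def is_valid(url):
        return any(p.match(url) for p in VALID)

    print(is_valid("http://www.example.com/lookup?fred"))         # True
    print(is_valid("http://www.example.com/lookup?fred&barney"))  # False
    print(is_valid("http://www.example.com/lookup?fred?blah"))    # False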

The web server already does this; it's part of the standard validation
routines. The injection issue raised here isn't specifically tied to ',
; and other atypical characters - those characters may very well be
expected and allowed. The problem is that bad developers don't pass
values to SQL as parameters; they build SQL statements blindly from user
input. Bad developers also forget to release memory they've used, skip
checks for null conditions, or, in C, skip a bounded copy like strncpy
and overrun the destination buffer. Bad developers also assemble file
names from user input without thought - and a crafty user can then get
the application to read, or even overwrite, important system files.

The problem here is bad developers, not bad servers. You cannot filter
out every fault a developer can introduce with URI validation. Bad code
is bad code.
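
To make the distinction concrete, here's a small sketch using Python's
built-in sqlite3 module (the table and values are invented for the
example):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.execute("INSERT INTO users VALUES ('fred', 'fred-secret')")

    user_input = "x' OR '1'='1"

    # Bad: the input becomes part of the SQL text itself.
    rows = conn.execute(
        "SELECT secret FROM users WHERE name = '" + user_input + "'"
    ).fetchall()
    print(rows)   # every secret comes back - the quote broke out

    # Better: the input is passed as a value and never parsed as SQL.
    rows = conn.execute(
        "SELECT secret FROM users WHERE name = ?", (user_input,)
    ).fetchall()
    print(rows)   # [] - there is no user literally named "x' OR '1'='1"

No character filtering is needed - the ' is perfectly harmless once it
is treated as data.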

> It's not that difficult to do this: as long as the proxy's list of
> static URLs and valid regexps is kept up-to-date ("make" and "cron"
> are your friends) then it will deflect all kinds of attacks.

What about when the input isn't that simple? International characters,
or characters like /, \ and <, may very well be valid input in names,
titles and other basic fields. Context is king here - I can make code do
bad things just by passing in too much text, if the code doesn't check
the size of the input buffer before copying it to the destination. And
that's with input made up entirely of allowable characters.

Besides, as I pointed out above, you now have to maintain your
application's context paths in multiple places. Imagine your URI
contains 10 variables - some numeric, some dates, some free text, some
binary-encoded - and the URI can list those fields in any order, quite
randomly, and still work. I'm not sure you're catching anything with
this method - the core validation of the URI's syntax is already done by
the web server, so there's no need for proxy validation at that level.
It's the context and use of the variables that's the risky part.
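
For instance (a sketch with made-up field names and rules), the natural
place to check such fields is one by one after parsing, where the
application knows what each one means - not with a single regexp over
the whole URI:

    from urllib.parse import urlparse, parse_qs

    RULES = {
        "id":    lambda v: v.isdigit(),
        "date":  lambda v: len(v) == 10 and v[4] == "-" and v[7] == "-",
        "title": lambda v: 0 < len(v) <= 200,  # apostrophes, accents are fine
    }

    def fields_ok(url):
        params = parse_qs(urlparse(url).query)
        return all(
            name in RULES and all(RULES[name](v) for v in values)
            for name, values in params.items()
        )

    # Same fields, different order - both pass; one URI regexp would have
    # to anticipate every ordering.
    print(fields_ok(
        "http://www.example.com/app?id=42&date=2015-03-24&title=O'Brien"))
    print(fields_ok(
        "http://www.example.com/app?title=O'Brien&date=2015-03-24&id=42"))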

>
> And -- and this is a key point -- it will deflect attacks that *you've
> never seen before*.  As Marcus Ranum notes in
>
> 	What is "Deep Inspection"?
> 	http://www.ranum.com/security/computer_security/editorials/deepinspect/index.html
>
> 	"For example, the TIS Firewall Toolkit's HTTP proxy not only mediated
> 	the entire HTTP transaction it performed URL sanitization and did not
> 	permit unusual characters in URLs. Several of the authors' friends
> 	who still use components of the firewall toolkit were highly amused
> 	when a piece of software that was coded in 1992 successfully defeated
> 	worms in 2002 -- 10 years before the worms had been written."

Firewall appliances that are layer-7 aware are another issue. Mostly
they serve to offload a DoS attack from the backend servers. Inspection
can also look for known attack patterns - i.e. a client that tries lots
of different combinations one after another - something a web/app server
isn't well equipped to do. But validating individual URIs isn't done
that often. I've seen more basic scanning for context roots - again from
a security perspective - to ensure only app-related requests are sent to
the app server. But they don't look at anything after the context root
name, since that changes per deployment. And security people don't like
to change their stuff that often ;)

> There's a lot to be said for interposing firewalls that use packet inspection
> *and* proxies in between every server in web service architectures.  Yes, it's
> more hardware.  Yes, it's more software.  But it allows you to be extremely
> strict about what's traversing your network and, if you do it right, it makes
> things very difficult for attackers.  It's also not hard to maintain if you
> know how to use basic tools like make, sed, awk, rsync, etc.

In this case you aren't addressing the real issue, and trying to patch
around it doesn't make the vulnerabilities go away. The security
appliances I'm aware of are meant to offload complex rules looking for
attack patterns, not to repeat basic application validation. To me,
SELinux comes into play here: it already constrains what the httpd
server is allowed to do, so if bad code gets deployed and tries to break
out of those bounds, it still isn't allowed to read files and other
resources. The app/web server already does core validation of the input
before passing it on to the code.


-- 
Regards
  Peter Larsen
