Fx{r} is trying to start the Fx{r} Community! Please join our group on Adobe Groups following this link: http://groups.adobe.com/groups/ab29539ab9.
Fx{r} is now on Twitter too. Follow us @ twitter.com/fx_r!
«
»

ActionScript, How to

Strip Html Tags – with allowable tags

Andrei Ionescu | 08.04.08 | 22 Comments

Google Buzz

Lately I’ve been working with Rich Text Editor so I found that I need more than what RTE offers. Recently I needed a function to strip HTML tags. If you have a PHP background you can remember the strip_tags function from PHP4 and PHP5 where you could strip all HTML tags but the ones you specified as allowable tags.

My function receives two string parameters:
 - html: the string with the HTML
 - tags: a string with allowable tags separated by comma

In a few words I’ll explain what it does step by step:

  1. split the allowable tags string by any kind of comma separation
  2. remove empty tags or spaces
  3. search for tags in the HTML string
  4. if found add it to “to be removed array”
  5. remove the tags from HTML string
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
public static function stripHtmlTags(html:String, tags:String = ""):String
{
    var tagsToBeKept:Array = new Array();
    if (tags.length > 0)
        tagsToBeKept = tags.split(new RegExp("\\s*,\\s*"));
 
    var tagsToKeep:Array = new Array();
    for (var i:int = 0; i < tagsToBeKept.length; i++)
    {
        if (tagsToBeKept[i] != null && tagsToBeKept[i] != "")
            tagsToKeep.push(tagsToBeKept[i]);
    }
 
    var toBeRemoved:Array = new Array();
    var tagRegExp:RegExp = new RegExp("<([^>\\s]+)(\\s[^>]+)*>", "g");
 
    var foundedStrings:Array = html.match(tagRegExp);
    for (i = 0; i < foundedStrings.length; i++) 
    {
        var tagFlag:Boolean = false;
        if (tagsToKeep != null) 
        {
            for (var j:int = 0; j < tagsToKeep.length; j++)
            {
                var tmpRegExp:RegExp = new RegExp("<\/?" + tagsToKeep[j] + "( [^<>]*)*>", "i");
                var tmpStr:String = foundedStrings[i] as String;
                if (tmpStr.search(tmpRegExp) != -1) 
                    tagFlag = true;
            }
        }
        if (!tagFlag)
            toBeRemoved.push(foundedStrings[i]);
    }
    for (i = 0; i < toBeRemoved.length; i++) 
    {
        var tmpRE:RegExp = new RegExp("([\+\*\$\/])","g");
        var tmpRemRE:RegExp = new RegExp((toBeRemoved[i] as String).replace(tmpRE, "\\$1"),"g");
        html = html.replace(tmpRemRE, "");
    } 
    return html;
}

Usage:

stripHtmlTags(myInput.htmlText, "b, i, u")

This will remove all tags that are not bold, italic or underline. See the following working example.

Share and Enjoy:
  • Twitter
  • Google Buzz
  • LinkedIn
  • Google Bookmarks
  • del.icio.us
  • Digg
  • Sphinn
  • blogmarks
  • Reddit
  • StumbleUpon
  • Facebook
  • DZone
  • FriendFeed
  • Yahoo! Buzz
  • Yahoo! Bookmarks
  • Slashdot
  • MySpace
  • Add to favorites




Tags: , , ,

This post was written by Andrei Ionescu

Views: 11771

related

22 Comments

have your say

Add your comment below, or trackback from your own site. Subscribe to these comments.

Be nice. Keep it clean. Stay on topic. No spam.

You can use these tags:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped="">

:

:


«
»