Table of contents
- Remove Duplicate Chars from a String
- Comparison: Remove Duplicate Chars from a String Via Array and RegExp
This post is about how to remove duplicate characters from a string. By that I mean making every character in the string unique.
Ex: “aabbbcccaadddee” will become “abcde”.
In short, the following steps are taken:
- Split the string to an array (source array)
- Create a second empty array which will retain the unique values
- Sort the source array case-insensitively
- Keep a variable holding the current value for comparison
- Walk through the whole source array and compare each character with the stored current value. If it differs, the character has not been added yet, so we add it to the second array and store it as the new current value. This works because the source array is sorted, so equal characters are always adjacent. After sorting there cannot be an array like this:
array("a","b","a")
but only a sorted one, like this:
array("a","a","b")
- Join back the second array to a string
These are the steps, and here is a function in ActionScript 3:
private function removeDuplicates(string:String):String {
    var arr:Array = string.split('');
    var currentValue:String = "";
    var tempArray:Array = new Array();
    // Sort so that equal characters end up next to each other.
    arr.sort(Array.CASEINSENSITIVE);
    arr.forEach(
        function(item:*, index:uint, array:Array):void {
            // Keep a character only when it differs from the previous one.
            if (currentValue != item) {
                tempArray.push(item);
                currentValue = item;
            }
        }
    );
    return tempArray.sort(Array.CASEINSENSITIVE).join('');
}
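For readers without an ActionScript compiler at hand, the same steps can be sketched in TypeScript. This is only a rough port under a few assumptions: the name `removeDuplicatesSorted` is made up for this example, and the lowercase comparator is a stand-in for `Array.CASEINSENSITIVE`, not an exact equivalent.

```typescript
// Sort the characters, then keep each one that differs from the previous one.
function removeDuplicatesSorted(input: string): string {
  const chars = input.split("");
  // Case-insensitive ordering (assumed stand-in for Array.CASEINSENSITIVE).
  chars.sort((a, b) => {
    const x = a.toLowerCase();
    const y = b.toLowerCase();
    return x < y ? -1 : x > y ? 1 : 0;
  });
  const unique: string[] = [];
  let current = "";
  for (let i = 0; i < chars.length; i++) {
    // After sorting, equal characters are adjacent, so one comparison is enough.
    if (chars[i] !== current) {
      unique.push(chars[i]);
      current = chars[i];
    }
  }
  return unique.join("");
}
```

Calling `removeDuplicatesSorted("aabbbcccaadddee")` gives "abcde". Note that the comparison itself is case-sensitive, as in the function above, so mixed-case input such as "aAa" may keep more than one entry per letter.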
I tried to get the same result using regular expressions, but without success. To stress the point: I mean removing duplicates so that the resulting string contains only unique characters. Collapsing only consecutive duplicates is easy, and I could give a regular expression for that here, but that is not the purpose of this article.
If anyone has solved this using RegExp, please share.
Tags: ActionScript, Regexp, sorting, string
This post was written by Andrei Ionescu

Hello. Great post. It led me to the regex solution I was looking for. I mentioned your post, as well as a possible regex solution to what you were looking for, here:
http://www.ilovebonnie.net/2008/04/09/php-and-regex-how-to-replace-or-remove-a-repeating-character/
I tried to get the same result with regular expressions but had no success, not because I don’t know regular expressions (I know them very well), but because there is no easy way of doing it. There are some attempts to do this using lookbehind expressions, but they require every character in the string to be separated by a separator (e.g. comma, pipe). This task needs at least two steps. The beauty of regular expressions is this: many steps needed for some string formatting/parsing can be done in one step using RegExp. But if the number of steps is the same, it will be faster not to use RegExp.
What your article accomplishes is collapsing consecutive duplicates, not making the characters unique. For example: “aabbccaaaaddee” will become “abcade”. What I did is make them unique.
Thanks for your comment. Because of it I edited the article a bit to make its purpose clearer.
Now that I further understand what you are looking for, I have taken on the challenge of trying to satisfy your needs.
Hopefully you will find this helpful:
http://www.ilovebonnie.net/2008/04/10/php-and-regex-replacing-repeating-characters-with-single-characters-in-a-string/
Still not satisfied. Although you used regular expressions, there is a loop, just as I have a loop. But your loop executes two RegExp operations, whilst mine executes only a simple comparison and a simple push to an array. It is easy to see that not using RegExp for this will be faster and simpler.
I want to mention again that “the beauty of regular expressions is this: many steps needed for some string formatting/parsing can be done in one step using RegExp. But if the number of steps is the same, it will be faster not to use RegExp.”
Hey, just thought I’d let you know that I posted a better (PHP) solution on the other blog post.
http://www.ilovebonnie.net/2008/04/10/php-and-regex-replacing-repeating-characters-with-single-characters-in-a-string/
Here’s the same thing in AS3 (I’m quite terrible at AS3, so I didn’t think I’d have any luck figuring it out):
var myPattern:RegExp = /(.)\1+/gi;
var str:String = "aaabbbcccddeeff";
trace(str.replace(myPattern, '$1'));
TLP, thanks, but that is NOT what I needed.
What I need is for a string like “aabbccaaaaaddee” to be transformed to “abcde” (please note that there are two groups of “a”s in the first string and only one “a” in the result).
Using “aabbccaaaaaddee” with your regular expression gives “abcade”, which is different from “abcde” (note the second “a”). I need unique chars in the resulting string.
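The difference is easy to reproduce. Here is a small sketch in TypeScript (not AS3, though the regular expression is the same apart from the `i` flag, which is left out here); `collapseRuns` is a hypothetical name used only for this illustration.

```typescript
// Collapses each run of a repeated character into one character.
// It only looks at neighbours, so a character that returns later
// in the string (the second group of "a"s) survives.
function collapseRuns(input: string): string {
  return input.replace(/(.)\1+/g, "$1");
}
```

With “aabbccaaaaaddee” this returns “abcade”: each run shrinks to one character, but the two separate runs of “a” are never compared with each other.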
I tried with regular expressions, but there is no easy (one-step) way.
Thanks also for the benchmarks. Great job.
Ah, terribly sorry about the misunderstanding. I really should read better before making a fool of myself.
Anyway, I came up with this little gem that’s a play off my previous code.
function removeDuplicates(string:String):String {
    return string.split('').sort().join('').replace(/(.)\1+/gi,'$1');
}
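The same idea (sort first so that equal characters become neighbours, then collapse the runs) can be sketched in TypeScript as well. The name `uniqueCharsViaRegex` is made up for this example, and the `i` flag is dropped on purpose, so uppercase and lowercase letters are treated as different characters here.

```typescript
// Sort the characters so duplicates are adjacent, then collapse each run.
function uniqueCharsViaRegex(input: string): string {
  return input
    .split("")
    .sort() // default sort compares UTF-16 code units
    .join("")
    .replace(/(.)\1+/g, "$1");
}
```

For “aabbccaaaaaddee” the sorted string is “aaaaaaabbccddee”, which collapses to “abcde”. One caveat: `.` does not match line terminators, so repeated newlines would not be collapsed by this pattern.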
Benchmark on 10,000 runs in AS3
Yours from the post above: 1.0310000000000024 seconds
Mine from this comment: 0.48499999999999943 seconds
Once again, apologies for the misunderstanding and hopefully I didn’t mess it up again.
This should help.
Worked for me.
Thanks Dave! A really nice algorithm at first sight. The problem is that it does not work properly. Try it with “aabbccaaaaddaaeeff”, which should be transformed to “abcdef”. It does not do that, because when you splice the array you are looping over, the end condition of the loop changes as the length changes, and the position pointer inside the array shifts as well. Everything just gets messed up. Thanks for the idea; maybe you can rework the algorithm so that it does not loop over and modify the same array.
A new post has arrived here. It is a comparison of the methods for getting a string with unique chars. Thanks TLP for sharing your solution.
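The general pitfall can be shown with a small sketch. To be clear, this is a hypothetical illustration written in TypeScript, not the code from the comment above; both function names are invented for the example.

```typescript
// Buggy on purpose: removes elements from the array it is iterating over.
function dedupeWithSpliceBuggy(input: string): string {
  const chars = input.split("");
  for (let i = 0; i < chars.length; i++) {
    if (chars.indexOf(chars[i]) < i) {
      // After the splice, the next element slides into slot i,
      // and the loop's i++ then steps over it without checking it.
      chars.splice(i, 1);
    }
  }
  return chars.join("");
}

// Safer variant: read from the source array and build a new one.
function dedupeWithFilter(input: string): string {
  const chars = input.split("");
  // Keep a character only at the position of its first occurrence.
  return chars.filter((c, i) => chars.indexOf(c) === i).join("");
}
```

With the input “aaab”, the first version returns “aab” because the third “a” is skipped right after the second one is removed. The second version returns “abcdef” for “aabbccaaaaddaaeeff”, keeping the characters in order of first appearance.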
I’m not very familiar with ActionScript. I stumbled upon your site while looking for a reminder on the regex syntax for look-behind in vim and was just a bit fascinated by this problem. I’m just curious: is there a way to transpose an AS array with a native function, so that the values become keys (like in a hash table / associative array)? I assume that in AS, like in most other languages, array keys are unique. If it is possible, maybe the language’s native constructs could do these operations faster:
1- split the string into an array
2- transpose the array (which should automatically eliminate duplicate keys).
3- gather all the keys and you have your unique values.
Based on the same concept, if you can’t transpose the array, you could just do a single loop through it and assign its values as keys and values to another array:
(Reminder: I don’t know AS. This code is based on what I’ve seen above, so try to decipher the logic.)
var firstArray:Array = thestring.split('');
var secondArray:Array = new Array();
firstArray.forEach(
    function (item:*, index:int, array:Array):void {
        // Using the character as the key means a repeated character overwrites its own entry.
        secondArray[item] = item;
    }
);
Reorder the second array after the operation; since it’s shorter, the reordering operation is more efficient.
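The keys-as-unique-values idea can be sketched in TypeScript with a plain object used as the lookup table. This is only an illustration of the concept, not a benchmarked solution; `uniqueCharsViaKeys` and `seen` are names invented for the example.

```typescript
// Use each character as an object key; a key can exist only once,
// so duplicates disappear without any explicit comparison.
function uniqueCharsViaKeys(input: string): string {
  const seen: { [key: string]: boolean } = {};
  const chars = input.split("");
  for (let i = 0; i < chars.length; i++) {
    seen[chars[i]] = true; // assigning an existing key again changes nothing
  }
  // Sort only the (shorter) list of unique keys at the end.
  return Object.keys(seen).sort().join("");
}
```

For “aabbccaaaaddaaeeff” this returns “abcdef”. The sort runs over the unique characters only, which is the efficiency argument made above.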
There’s no guarantee that the whole thing will be faster, if you have the time to benchmark it, maybe it’s worth a shot. Off I go.
Thanks Michael for your idea. It sounds good and I’ll benchmark it. As far as I know, ActionScript has no built-in method or function that will invert/transpose an array, but it is worth trying to implement the loop as you explained in your comment. Thanks!
PERL SCRIPT:
OUTPUT:
abcdef