When programming against any LDAP backend, you should sanitize any user input that may go into a search filter. A typical case is authentication (login or single sign-on) applications, where an input username or email must be used to resolve a user’s distinguished name (DN) in the LDAP directory.

Consider, for example, the following search filter template, which retrieves a user by an exact match of the supplied employee name against the cn attribute.

$filter = "(&(cn=$employeename))";
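To see why raw input is a problem, here is a minimal C# sketch that substitutes unescaped input into the template above. The class and method names are hypothetical and exist only for illustration; only the filter template comes from the text.

```csharp
using System;

static class FilterDemo
{
    // Hypothetical helper: naive substitution into the template, with no escaping.
    public static string BuildFilterUnsafe(string employeeName)
    {
        return "(&(cn=" + employeeName + "))";
    }

    static void Main()
    {
        // Intended use: an exact match on one name.
        Console.WriteLine(BuildFilterUnsafe("James Doe"));    // (&(cn=James Doe))

        // A lone asterisk becomes a presence test: any entry with a cn matches.
        Console.WriteLine(BuildFilterUnsafe("*"));            // (&(cn=*))

        // A parenthesis in the input closes the cn clause and adds a new one.
        Console.WriteLine(BuildFilterUnsafe("jdoe)(mail=*")); // (&(cn=jdoe)(mail=*))
    }
}
```

In the last two cases the resulting filter is still syntactically valid, so the directory server has no way to tell that its meaning was changed by the input.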

If the value itself contains parentheses, they must be escaped. On some systems, you escape ( and ) by placing a \ directly before each one.

$filter = "(&(cn=James \(Jim\) Doe))";

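If user input is inserted unescaped, characters such as *, ( and ) are interpreted as filter syntax and can change the meaning of the search. To prevent this from happening, you may limit the range of acceptable input characters, or you may use a function that sanitizes the input by escaping all special characters in the assertion value. The special search filter characters and how to escape them are specified in detail in RFC 4515 (LDAP: String Representation of Search Filters), which represents each escaped character as a backslash followed by two hexadecimal digits, for example \28 for ( and \29 for ).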
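The first option, limiting the acceptable characters, could look roughly like the following C# sketch. The allowed character set and the 64-character limit are assumptions chosen for illustration, and the class and method names are hypothetical; a real policy depends on what your directory stores.

```csharp
using System;
using System.Text.RegularExpressions;

static class InputValidation
{
    // Assumed policy for illustration: letters, digits, space, dot, underscore,
    // @ and hyphen, between 1 and 64 characters. Adjust to your directory.
    static readonly Regex Allowed = new Regex(@"\A[\p{L}\p{Nd} ._@-]{1,64}\z");

    public static bool IsAcceptable(string input)
    {
        return input != null && Allowed.IsMatch(input);
    }

    static void Main()
    {
        Console.WriteLine(IsAcceptable("james.doe@example.com")); // True
        Console.WriteLine(IsAcceptable("James (Jim) Doe"));       // False
        Console.WriteLine(IsAcceptable("*"));                     // False
    }
}
```

Note that such a whitelist also rejects legitimate values like James (Jim) Doe, which is one reason escaping is often the more flexible choice.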

Here is one such sanitising method, written in C#:

using System.Text;

static string EscapeSpecialCharacters(string ad_filter)
{
    // Remove line breaks
    ad_filter = ad_filter.Replace("\n", "").Replace("\r", "");
    var escapedFilter = new StringBuilder();
    // Strict UTF-8 encoder: throws on invalid input such as an unpaired surrogate
    var utf8 = new UTF8Encoding(false, true);
    try
    {
        // UTF-8 is a variable-width encoding, with each character represented by
        // one to four bytes. Walking the bytes lets multi-byte characters be
        // escaped byte by byte.
        foreach (byte b in utf8.GetBytes(ad_filter))
        {
            if (b == '*')
                // escape asterisk
                escapedFilter.Append("\\2a");
            else if (b == '(')
                // escape left parenthesis
                escapedFilter.Append("\\28");
            else if (b == ')')
                // escape right parenthesis
                escapedFilter.Append("\\29");
            else if (b == '\\')
                // escape backslash
                escapedFilter.Append("\\5c");
            else if (b == 0x00)
                // escape NULL char
                escapedFilter.Append("\\00");
            else if (b <= 0x7f)
                // regular 1-byte UTF-8 char (high-order bit is 0, range 0..127)
                escapedFilter.Append((char)b);
            else
                // byte of a 2-, 3- or 4-byte UTF-8 sequence: emit as \xx
                escapedFilter.Append('\\').Append(b.ToString("x2"));
        }
    }
    catch (EncoderFallbackException e)
    {
        // Log is the application's logger
        Log.Error(e, "Could not convert input to UTF-8");
        return "";
    }
    return escapedFilter.ToString();
}
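As a usage illustration, the sketch below escapes a few inputs and substitutes them into the filter template from earlier. The Escape helper is a compact stand-in for the method above, included only so that the example compiles on its own; it is an assumption for this example, as are the class and method names.

```csharp
using System;
using System.Text;

static class EscapeUsage
{
    // Compact stand-in so this sketch is self-contained: escapes * ( ) \ NUL
    // and every non-ASCII UTF-8 byte as a backslash plus two hex digits.
    public static string Escape(string value)
    {
        var sb = new StringBuilder();
        foreach (byte b in Encoding.UTF8.GetBytes(value))
        {
            bool special = b == '*' || b == '(' || b == ')' || b == '\\' || b == 0 || b >= 0x80;
            if (special)
                sb.Append('\\').Append(b.ToString("x2"));
            else
                sb.Append((char)b);
        }
        return sb.ToString();
    }

    // Escape first, then substitute into the template.
    public static string BuildFilter(string employeeName)
    {
        return "(&(cn=" + Escape(employeeName) + "))";
    }

    static void Main()
    {
        Console.WriteLine(BuildFilter("James (Jim) Doe")); // (&(cn=James \28Jim\29 Doe))
        Console.WriteLine(BuildFilter("*"));               // (&(cn=\2a))
        Console.WriteLine(BuildFilter("Zo\u00eb"));        // (&(cn=Zo\c3\ab))
    }
}
```

After escaping, an input of * matches only an entry whose cn is literally an asterisk, rather than every entry that has a cn attribute.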
