JavaScript regex test() not recognising Danish characters


I have made a search function, similar to Facebook's search field autocomplete, using JavaScript and regex. It works fine, but when I search for the Danish letters Æ, Ø, or Å, the .test() function won't recognize them and nothing is returned.

This is how the search part works:

var regxsearch = "\\b" + sterm; // sterm is the value of the search field
var regx = new RegExp(regxsearch, "gi");
var namecheck = regx.test(users[i]["user"]["name"]);

Imagine the usernames Asbjørn, Østergård, and Jason:

  • If I search for "asbjør", "asb" or "ørn", it returns true.
  • If I search for "øster" or "østergård", it returns false.
  • If I search for "stergård", "ård" or "rd", it returns true.
  • If I search for "j", "jas", "jaso" etc., it returns true.
  • If I search for "ason" or "son", it returns false.

I found a fiddle that is able to search for æøå, but it only works when you search for the entire word. I'm not good enough to decode how it works, but maybe you can use it to find a possible fix for my problem: http://jsfiddle.net/8y3cm/17/

Is this fixable, or do I need to switch to some kind of plugin for the search function?

Your problem is twofold.

First: \b matches a word break at that position. A word break matches when one side has a word character and the other side does not. Your regex starts with "\\b"+sterm, so for jason it fails on \bason and \bson, but matches on \bj, \bjas, and \bjaso. If there is 'nothing' to the left of \b, that counts as 'not a word character' (there is no word character, see? :). In your failing cases there is a word character to the left, while in the matching cases there is not.
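To see the first point in isolation, here is a minimal sketch that only uses plain ASCII names. The helper name matchesAtWordBreak is made up for this illustration and is not part of the original code; it uses console.log rather than alert so it also runs outside a browser.

```javascript
// Hypothetical helper (not from the original post):
// does "\b" + term match anywhere in name, ignoring case?
function matchesAtWordBreak(name, term) {
    return new RegExp("\\b" + term, "i").test(name);
}

// Nothing sits to the left of "j", so \b matches at the start.
console.log(matchesAtWordBreak("jason", "jas"));  // true

// "a" sits to the left of "s"; both are word characters, so no word break.
console.log(matchesAtWordBreak("jason", "son"));  // false
console.log(matchesAtWordBreak("jason", "ason")); // false
```

So for plain ASCII names, "\\b"+sterm already behaves as a "starts a word with" search, which is presumably what you wanted.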

Second: the characters ø and å are not considered "word characters" in JavaScript, as a simple test will show you:

alert("østergård".match(/\w+/g)); // shows sterg,rd

Since they are not considered word characters, the behavior of \b is the reverse of what you think it does:

alert ("østergård".match(/\bøster/)); // null 

It fails because \b sees a non-word character on its right (the ø), so it needs a word character on its left. There isn't one: there is nothing there.

A small test suite with your sample cases:
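One way to make this visible is to mark every position where \b matches. This is only an illustrative sketch; markWordBreaks is a name invented here, and the "|" marker is an arbitrary choice.

```javascript
// Hypothetical helper: insert "|" at every position where \b matches.
function markWordBreaks(text) {
    return text.replace(/\b/g, "|");
}

console.log(markWordBreaks("jason"));     // |jason|
console.log(markWordBreaks("østergård")); // ø|sterg|å|rd|

// There is no word break in front of the ø, so \bøster cannot match,
// while \bstergård can (the break sits between ø and s).
console.log(/\bøster/.test("østergård"));    // false
console.log(/\bstergård/.test("østergård")); // true
```

Note how the breaks fall around ø and å instead of around the whole name: as far as \b is concerned, "østergård" is the two words "sterg" and "rd".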

var sterm = [
    [ "asbjørn", "asbjør", "asb", "ørn" ],
    [ "østergård", "øster", "østergård" ],
    [ "østergård", "stergård", "ård", "rd" ],
    [ "jason", "j", "jas", "jaso" ],
    [ "jason", "ason", "son" ]
];

var r = '';
for (var s = 0; s < sterm.length; s++) {
    // sterm[s][0] is the user name; the remaining entries are search terms
    for (var s2 = 1; s2 < sterm[s].length; s2++) {
        var regxsearch = "\\b" + sterm[s][s2];
        var regx = new RegExp(regxsearch, "gi");
        var namecheck = regx.test(sterm[s][0]);
        r += "[" + sterm[s][s2] + "] on [" + sterm[s][0] + "] " + namecheck + '\n';
    }
}
alert(r);

This shows the same order of true and false that you reported. If you remove the \b in regxsearch, you will see that they all return true.

Why does your own 'temporary fix' fix it?

You replace the non-word characters (nothing personal, that is according to JavaScript!) with valid word characters, and so the expected behavior of \b is back.
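Your actual temporary fix is not shown here, so the following is only a guess at what such a replacement could look like. The stand-ins ("ae", "oe", "aa") and the function names are assumptions made for this sketch.

```javascript
// Assumption: map each Danish letter to an ASCII stand-in that \w accepts.
var standIns = {
    "æ": "ae", "ø": "oe", "å": "aa",
    "Æ": "AE", "Ø": "OE", "Å": "AA"
};

// Hypothetical helper: replace the Danish letters in a string.
function toWordChars(text) {
    return text.replace(/[æøåÆØÅ]/g, function (ch) {
        return standIns[ch];
    });
}

// Apply the same replacement to both the name and the search term,
// so \b only ever sees characters it treats as word characters.
function matchesAfterReplacing(name, term) {
    return new RegExp("\\b" + toWordChars(term), "i").test(toWordChars(name));
}

console.log(matchesAfterReplacing("østergård", "øster")); // true
console.log(matchesAfterReplacing("østergård", "ård"));   // false
console.log(matchesAfterReplacing("jason", "son"));       // false
```

A drawback of this kind of replacement is that a search for the stand-in itself (for example "aa") will now also match names containing å.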

A better fix is to not rely on the specific behavior of \b (and, by extension, \w). If these user names may appear anywhere in a text (so not only at the beginning of a string), you can use this:

var regxsearch = "(^|[^\\wøå])"+sterm; 

where the regex

(^|[^\\wøå]) 

stands for

  • ^ the beginning of the string
  • | or
  • [^...] not (^) any of the characters \w, ø, or å
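Put together, the search could look roughly like the sketch below. Two things in it are my own additions and not part of the pattern above: æ is added to the character class (you mentioned Æ as well), and the search term is escaped so that characters such as + or ( typed by the user are not interpreted as regex syntax. The names escapeRegExp and nameStartsWordWith are hypothetical.

```javascript
// Hypothetical helper: escape regex metacharacters in user input.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// True when term starts the name or follows a character that is
// neither \w nor one of æ, ø, å (the "i" flag covers Æ, Ø, Å too).
function nameStartsWordWith(name, term) {
    var regx = new RegExp("(^|[^\\wæøå])" + escapeRegExp(term), "i");
    return regx.test(name);
}

console.log(nameStartsWordWith("østergård", "øster"));    // true
console.log(nameStartsWordWith("østergård", "stergård")); // false
console.log(nameStartsWordWith("asbjørn", "ørn"));        // false
console.log(nameStartsWordWith("jason", "jas"));          // true
console.log(nameStartsWordWith("jason", "son"));          // false
```

The "g" flag is left out on purpose: with "g", a regex object that is reused keeps its lastIndex between test() calls, which is not needed for a simple yes/no check.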
