
As 4e6 says, the problem lies in Java's default configuration, which here assumes that all files are encoded in Latin-1.

1. val Matcher = """.+/(.*)""".r
2. val Matcher(title) = """http://en.wikipedia.org/wiki/Château_La_Louvière"""

This can be fixed by setting the following JAVA_OPTS:

export JAVA_OPTS='-Dfile.encoding=UTF-8'
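
To check whether the option was actually picked up, something like the following sketch can be run on the same JVM. The object name ShowDefaultCharset is made up for this example; the calls are plain JDK.

```scala
import java.nio.charset.Charset

// Hypothetical helper: prints the encoding the JVM uses when none is given explicitly.
object ShowDefaultCharset {
  val defaultCharset: Charset = Charset.defaultCharset()
  // May differ from defaultCharset on some JVM versions, so both are shown.
  val fileEncoding: String = System.getProperty("file.encoding")

  def main(args: Array[String]): Unit = {
    println("default charset: " + defaultCharset.name)
    println("file.encoding:   " + fileEncoding)
  }
}
```

If the export worked, both lines should report UTF-8.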

Still, 1. and 2. will work even if you don't change the encoding. The problem lies in 3. and 4.:

3. val lowerCase = title.toLowerCase
4. if (lowerCase.equals("château_la_louvière")) //do something

''toLowerCase'' will cause the test in 4. to fail, because "â" and "è" are interpreted wrongly. In UTF-8 each of these characters is encoded as two bytes (other characters can take up to four). Read as Latin-1, every byte becomes a separate character, and each one is lowercased independently, which yields a string quite different from ''château_la_louvière''.
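
The effect can be reproduced in isolation. This is only a sketch: it simulates the wrong decoding by hand (UTF-8 bytes read back as Latin-1) instead of relying on the compiler's settings, and the names EncodingDemo and misread are made up for the example. The strings use Unicode escapes so the demo itself does not depend on the file encoding.

```scala
import java.nio.charset.StandardCharsets.{ISO_8859_1, UTF_8}
import java.util.Locale

object EncodingDemo {
  // "Château_La_Louvière" and its lowercase form, written with escapes.
  val title = "Ch\u00e2teau_La_Louvi\u00e8re"
  val expected = "ch\u00e2teau_la_louvi\u00e8re"

  // Simulates UTF-8 source text being read as Latin-1: each byte becomes one char.
  def misread(s: String): String = new String(s.getBytes(UTF_8), ISO_8859_1)

  val misreadTitle = misread(title)
  val misreadExpected = misread(expected)

  // Correctly decoded strings compare equal after lowercasing.
  val correctMatch = title.toLowerCase(Locale.ROOT) == expected
  // In the misread strings the lead byte 0xC3 shows up as an uppercase letter,
  // which lowercasing changes in the title but not in the literal.
  val brokenMatch = misreadTitle.toLowerCase(Locale.ROOT) == misreadExpected

  def main(args: Array[String]): Unit = {
    println("correct decoding matches: " + correctMatch)
    println("misread decoding matches: " + brokenMatch)
  }
}
```

With correct decoding the comparison holds; with the simulated Latin-1 decoding it does not, because the misread title is two characters longer and its extra characters change under lowercasing.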

