Error parsing KML with accented characters non-local files #864
Comments
@springmeyer this looks like an upstream issue. Since you're an OGR dev, maybe take a look?
The problem appears to be that the file is not utf8 encoded. In fact it is iso-8859-1 (according to the detection of the unix
So, ogr trusts the declared character encoding at the top of the KML (
If I reencode the file as utf8 then it works fine:
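For anyone wanting to try the same thing, a minimal Python sketch of such a re-encoding step is below. It assumes the input really is ISO-8859-1; the helper name `reencode_latin1_to_utf8` and the sample string are made up for illustration and are not the command used above.

```python
def reencode_latin1_to_utf8(raw: bytes) -> bytes:
    """Decode bytes as ISO-8859-1 and re-encode them as UTF-8."""
    # ISO-8859-1 maps every byte to a code point, so decoding never fails;
    # that also means it will silently mangle input that is not Latin-1.
    return raw.decode("iso-8859-1").encode("utf-8")


# Hypothetical sample: one KML-like element containing the accented character.
sample = "<name>Puño</name>".encode("iso-8859-1")
fixed = reencode_latin1_to_utf8(sample)

print(sample)  # the ñ is the single byte 0xF1
print(fixed)   # the ñ is now the two-byte UTF-8 sequence 0xC3 0xB1
```

The declared encoding at the top of the KML already says UTF-8 in this case, so only the bytes need to change, not the XML declaration.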
This is a very common problem, and I've solved it in the past by forcing utf-8 encoding after detecting the likely encoding using http://pypi.python.org/pypi/chardet. We'll have to look for something similar in nodejs. Until then the only way I can think of handling this is going back to the fusion tables source and changing the character to use the utf-8 encoded value. If you don't know how to do this then perhaps contact the fusion tables support.
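A rough sketch of the "detect, then force utf-8" idea, using only the Python standard library rather than chardet: try strict UTF-8 first and fall back to a single assumed encoding. The function name `force_utf8` and the ISO-8859-1 fallback are assumptions for illustration; a real detector would consider more encodings than this.

```python
def force_utf8(raw: bytes, fallback: str = "iso-8859-1") -> bytes:
    """Return raw as UTF-8 bytes, re-encoding from fallback if needed."""
    try:
        # Strict decoding rejects byte sequences that are not valid UTF-8.
        raw.decode("utf-8")
        return raw
    except UnicodeDecodeError:
        # Assumption: anything that is not valid UTF-8 is in the fallback encoding.
        return raw.decode(fallback).encode("utf-8")


latin1_bytes = "Puño".encode("iso-8859-1")
utf8_bytes = "Puño".encode("utf-8")

print(force_utf8(latin1_bytes) == utf8_bytes)  # Latin-1 input is converted
print(force_utf8(utf8_bytes) == utf8_bytes)    # valid UTF-8 passes through unchanged
```

Checking UTF-8 first is the safer order, because valid UTF-8 text with accented characters would otherwise be double-encoded by a blind Latin-1 conversion.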
I'm investigating a similar problem right now with the KMZ containing the World Heritage sites. I think one reason the problem is not getting fixed at the source is that Google Earth appears to ignore the declared XML encoding and interpret the characters as Latin-1. I'm not sure if it's hardwired for Latin-1 or if it's using some type of sniffer to guess.
I get the following error:
Error: OGR Plugin: XML parsing of KML file failed : not well-formed (invalid token) at line 19, column 26
from this KML:
When loading from this link:
https://www.google.com/fusiontables/exporttable?query=select+col5+from+1612141+&o=kml&g=col5
However if I load this file locally there is no error.
The character at line 19, column 26 is the ñ in Puño. Changing this to a non-accented n allows network import with no errors.
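The same kind of failure can be reproduced outside OGR with Python's bundled Expat bindings. This is only a sketch: it assumes the bytes reach an Expat-style parser that trusts the declared encoding (the wording of the error suggests that, but it is not confirmed here), and the two-line document below is a made-up stand-in for the real KML.

```python
import xml.parsers.expat


def parse_error(raw: bytes):
    """Return the Expat error message for raw, or None if it parses cleanly."""
    parser = xml.parsers.expat.ParserCreate()  # encoding taken from the XML declaration
    try:
        parser.Parse(raw, True)
    except xml.parsers.expat.ExpatError as exc:
        return str(exc)
    return None


# Hypothetical stand-in for the exported KML: declares UTF-8 in the header.
HEADER = b'<?xml version="1.0" encoding="UTF-8"?>\n'
BODY = "<kml><name>Puño</name></kml>"

mislabelled = HEADER + BODY.encode("iso-8859-1")  # Latin-1 bytes, UTF-8 label
correct = HEADER + BODY.encode("utf-8")           # bytes match the label

print(parse_error(mislabelled))
print(parse_error(correct))
```

The mislabelled document should be rejected with a "not well-formed (invalid token)" message pointing at the ñ, while the correctly encoded one parses without error.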