optimize handling of expensive layer initialization #798

Open
springmeyer opened this issue Oct 17, 2011 · 3 comments

Comments

@springmeyer
Member

A common scenario in TileMill with PostGIS is very slow layer creation. This ticket is not about the cause but about its impact, which we can mitigate in TileMill to improve overall performance.

An illustrative scenario: the user is visualizing, for the first time, a table that contains millions of records. Mapnik expects the extent to be passed as an option to avoid the costly lookup of calling ST_Extent on the table, but in this case no extent is passed.
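As a rough illustration of what passing the extent looks like, here is a minimal sketch. Mapnik's PostGIS datasource takes an `extent` parameter as a `minx,miny,maxx,maxy` string; the helper name `with_extent`, the database and table names, and the use of a plain dict are assumptions made for this example and are not TileMill code.

```python
def with_extent(datasource, bbox):
    """Return a copy of the datasource params with an explicit extent.

    `bbox` is (minx, miny, maxx, maxy). Supplying it lets the datasource
    skip computing the extent from the table itself.
    """
    minx, miny, maxx, maxy = bbox
    if minx > maxx or miny > maxy:
        raise ValueError("bbox must be (minx, miny, maxx, maxy)")
    params = dict(datasource)  # copy, so the caller's dict is untouched
    params["extent"] = f"{minx},{miny},{maxx},{maxy}"
    return params


# Hypothetical layer settings; only the parameter names mirror Mapnik's.
postgis_layer = {
    "type": "postgis",
    "dbname": "gis",
    "table": "big_table",
    "geometry_field": "geom",
}

params = with_extent(postgis_layer, (-180, -90, 180, 90))
print(params["extent"])
```

This only helps when the extent is known or can be cached ahead of time; it does not address the repeated layer creation described in this ticket.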

The impact of this (or of any scenario where a layer takes a long time to create) is that the layer is created repeatedly: once when the Layer UI is saved, again when the CSS style is saved, and again each time tilelive grows the pool of maps as rendering starts and load increases (up to 5 times, I think, which is the max pool size in tilelive-mapnik).

#1 - Ideally the layer could be created once and its instance passed to tilelive-mapnik. This is currently not feasible because all layer initialization is done through XML, so tilelive-mapnik would need a way to load just the mss and pass the layer instance directly to node-mapnik, rather than re-reading the mml layer json.

#2 - Ideally, too, newly loaded maps could be cloned (rather than reloaded from XML) in an intelligent way, so that layer instances stay unique (for parallel rendering that is thread safe) but expensive initialization steps are skipped. This would be an upstream mapnik issue.
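To make the cloning idea concrete, here is a language-neutral sketch in Python. It is not the tilelive-mapnik or node-mapnik API: `Layer`, `MapPool`, `clone`, and `grow` are hypothetical names, the expensive step is simulated with a counter, and the pool limit of 5 is the figure guessed at above.

```python
import copy


class Layer:
    """Stand-in for a layer whose datasource is expensive to initialize."""

    init_count = 0  # how many times the expensive step has run

    def __init__(self, name):
        self.name = name
        Layer.init_count += 1  # simulates the slow extent lookup or parse
        self.extent = (-180.0, -90.0, 180.0, 90.0)  # result of the slow step
        self.render_state = []  # mutable state that must not be shared

    def clone(self):
        """Copy the computed results but give the clone its own state."""
        other = copy.copy(self)  # shallow copy, does not call __init__
        other.render_state = []
        return other


class MapPool:
    """Builds the first map the slow way, then clones it for later slots."""

    def __init__(self, layer_names, max_size=5):
        self.max_size = max_size
        self._template = [Layer(n) for n in layer_names]
        self.maps = [self._template]

    def grow(self):
        if len(self.maps) >= self.max_size:
            raise RuntimeError("pool is full")
        new_map = [layer.clone() for layer in self._template]
        self.maps.append(new_map)
        return new_map


pool = MapPool(["roads"])
while len(pool.maps) < pool.max_size:
    pool.grow()
print(Layer.init_count)
```

Each map in the pool holds a distinct layer object with its own mutable state, yet the simulated expensive step runs once rather than once per pool slot. Whether real Mapnik datasources can be split into shareable and per-instance parts this cleanly is the open upstream question.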

@springmeyer
Member Author

A couple of additional notes:

  1. Because I went with in-memory parsing for CSV files, this issue severely impacts TileMill usability for large CSV files. A 55,000 record CSV takes ~5 seconds to parse, so taking that 5 second hang every time one of the tilelive rendering threads wakes up hurts. This will hopefully be less painful once I add a lazy parsing mode upstream in mapnik: lazy parsing mode for csv plugin mapnik/mapnik#919

  2. Overall, fixing this issue potentially spans a lot of code in TileMill and possibly other parts like tilelive and node-mapnik (which might need fixes to enable this), which is why a top-level issue against TileMill makes sense (until we have sub-issues).

@yhahn
Contributor

yhahn commented Dec 13, 2011

Hrm, very interesting. If I understand correctly, the idea is to have the Pool in tilelive-mapnik clone the mapnik map object from its first created instance rather than create new ones (up to 5).

Is there an exposed way to clone a mapnik map object atm? Or is this something you're thinking of adding to mapnik/node-mapnik?

@springmeyer
Member Author

@yhahn - exactly, that is the goal. We just need to vet this upstream in mapnik and potentially refactor the datasource implementations a bit to make this work right, ultimately landing as an API call in node-mapnik (I imagine). @artemp is looking into this.
