optimize handling of expensive layer initialization #798

springmeyer · 2011-10-17T22:26:56Z

A common scenario in TileMill with PostGIS is very slow layer creation. This ticket is not about the cause but about its impact, which we can mitigate in TileMill to improve overall performance.

An illustrative scenario: the user is visualizing, for the first time, a table that contains millions of records. Mapnik expects the extent to be passed as an option to avoid the costly lookup of calling ST_Extent on the table, but in this scenario no extent is passed, so the costly lookup runs.
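To make the extent option concrete, here is a minimal sketch of building PostGIS datasource options with a precomputed extent. The helper name `postgis_datasource_options` is hypothetical and is not TileMill or Mapnik code; the sketch assumes the `extent` option is a comma-separated `minx,miny,maxx,maxy` string.

```python
from typing import Dict, Optional, Tuple

# (minx, miny, maxx, maxy) in the layer's coordinate system
Extent = Tuple[float, float, float, float]


def postgis_datasource_options(
    table: str, dbname: str, extent: Optional[Extent] = None
) -> Dict[str, str]:
    """Hypothetical helper: build the option dict for a PostGIS layer.

    When an extent is supplied it is serialized into the options, which is
    what lets the datasource skip the ST_Extent lookup described above.
    """
    options = {"type": "postgis", "dbname": dbname, "table": table}
    if extent is not None:
        minx, miny, maxx, maxy = extent
        if minx > maxx or miny > maxy:
            raise ValueError("extent must be ordered minx,miny,maxx,maxy")
        # Assumed format: comma-separated floats
        options["extent"] = ",".join(repr(float(v)) for v in extent)
    return options


with_extent = postgis_datasource_options("big_table", "gis", (-180, -90, 180, 90))
without_extent = postgis_datasource_options("big_table", "gis")
```

Only `with_extent` carries the `extent` key; `without_extent` corresponds to the slow case in this ticket.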

The impact of this (or of any scenario where a layer takes a long time to create) is that the layer is created repeatedly: once when the Layer UI is saved, again when the CSS style is saved, and again each time tilelive grows its pool of maps as rendering starts and load increases (up to 5 times, I think, which is the max pool size in tilelive-mapnik).
#1 - Ideally the layer could be created once and its instance passed to tilelive-mapnik, but this is currently not feasible because all layer initialization is done through XML. (tilelive-mapnik would need a way to load just the mss and pass the layer instance directly to node-mapnik, rather than re-reading the mml layer JSON.)
#2 - And ideally new maps could be cloned (rather than reloaded from XML) in an intelligent way, so that layer instances stay unique (for proper, thread-safe parallel rendering) while expensive initialization steps are skipped. This would be an upstream mapnik issue.
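The "create once" idea in #1 can be sketched as a cache keyed by the layer's datasource options. This is only an illustration under stated assumptions: `LayerCache` and `slow_create` are hypothetical names, and nothing here reflects an existing TileMill, tilelive-mapnik, or node-mapnik API.

```python
import json
from typing import Any, Callable, Dict, List


class LayerCache:
    """Hypothetical cache: build a layer once per distinct set of options."""

    def __init__(self, create_layer: Callable[[Dict[str, Any]], Any]) -> None:
        self._create_layer = create_layer
        self._layers: Dict[str, Any] = {}

    def get(self, options: Dict[str, Any]) -> Any:
        # A stable key, so equal option dicts map to the same cached layer
        key = json.dumps(options, sort_keys=True)
        if key not in self._layers:
            self._layers[key] = self._create_layer(options)
        return self._layers[key]


calls: List[str] = []


def slow_create(options: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for expensive layer initialization; records each call."""
    calls.append(options["table"])
    return {"table": options["table"], "ready": True}


cache = LayerCache(slow_create)
opts = {"type": "postgis", "table": "big_table"}
layer_ui_save = cache.get(opts)    # first save: layer is created
style_save = cache.get(dict(opts))  # second save: cached layer is reused
```

Both saves end up with the same instance, and `slow_create` runs once. Sharing one instance this way does not address the thread-safety concern in #2, which is why cloning is raised there as a separate step.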


springmeyer · 2011-10-26T00:43:43Z

Couple additional notes:

Because I went with in-memory parsing for CSV files, this issue severely impacts TileMill usability for large CSV files. A 55,000 record CSV takes ~5 seconds to parse, so having that 5 second hang hit us every time one of the tilelive rendering threads wakes up hurts. Hopefully this will be less painful once I add a lazy parsing mode upstream in mapnik: lazy parsing mode for csv plugin mapnik/mapnik#919
Overall, fixing this issue potentially spans a lot of code in TileMill and possibly other parts such as tilelive and node-mapnik (which might need fixes to enable this), which is why a top-level issue against TileMill makes sense here (until we have sub-issues).
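The difference between in-memory and lazy parsing can be illustrated with a small sketch. It uses Python's standard `csv` module purely as a stand-in; it is not how the mapnik csv plugin works, nor how mapnik/mapnik#919 would be implemented, and `parse_eager` and `parse_lazy` are hypothetical names.

```python
import csv
import io
from typing import Dict, Iterator, List


def parse_eager(text: str) -> List[Dict[str, str]]:
    """Parse every record up front; the cost is paid before any row is used."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_lazy(text: str) -> Iterator[Dict[str, str]]:
    """Yield records one at a time; nothing is parsed until a row is requested."""
    yield from csv.DictReader(io.StringIO(text))


sample = "name,lon,lat\na,1,2\nb,3,4\n"

all_rows = parse_eager(sample)  # both records parsed immediately
rows = parse_lazy(sample)       # generator created, no parsing yet
first = next(rows)              # only the first record is parsed here
```

With the eager form, a 55,000 record file is parsed in full every time the datasource is initialized; with the lazy form, initialization itself stays cheap and the parsing cost moves to when rows are actually read.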

yhahn · 2011-12-13T15:12:07Z

Hrm, very interesting. If I understand correctly, the idea is to have the Pool in tilelive-mapnik clone the mapnik map object from its first created instance rather than create new ones (up to 5).

Is there an exposed way to clone a mapnik map object atm? Or is this something you're thinking of adding to mapnik/node-mapnik?

springmeyer · 2011-12-14T02:02:52Z

@yhahn - exactly, that is the goal. We just need to vet this upstream in mapnik and potentially refactor the datasource implementations a bit to make this work right, ultimately landing as an API call in node-mapnik (I imagine). @artemp is looking into this.
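As a rough illustration of the pool behavior discussed in this thread, here is a minimal sketch in which the first map pays the initialization cost and later pool members are cloned from it. `FakeMap` and `ClonePool` are hypothetical stand-ins; no clone API in mapnik or node-mapnik is assumed to exist, and the pool size of 5 is taken from the estimate earlier in the thread.

```python
import copy
from typing import Any, Dict, List


class FakeMap:
    """Stand-in for a map object whose construction is expensive."""

    init_count = 0  # counts how many times full initialization ran

    def __init__(self, xml: str) -> None:
        FakeMap.init_count += 1
        self.xml = xml
        self.layers: List[Dict[str, Any]] = [{"name": "big_table"}]

    def clone(self) -> "FakeMap":
        # copy.copy does not call __init__, so the expensive step is skipped
        twin = copy.copy(self)
        # Each clone gets its own layer objects, so instances are not shared
        twin.layers = copy.deepcopy(self.layers)
        return twin


class ClonePool:
    """Hypothetical pool: initialize once, clone for every later member."""

    def __init__(self, xml: str, max_size: int = 5) -> None:
        self._xml = xml
        self._max_size = max_size
        self._maps: List[FakeMap] = []

    def acquire(self) -> FakeMap:
        if len(self._maps) >= self._max_size:
            raise RuntimeError("pool exhausted")
        new_map = self._maps[0].clone() if self._maps else FakeMap(self._xml)
        self._maps.append(new_map)
        return new_map


pool = ClonePool("<Map/>", max_size=5)
maps = [pool.acquire() for _ in range(5)]
```

After filling the pool, `FakeMap.init_count` is 1 rather than 5, and each map holds its own `layers` list. Whether real datasources can be copied this cheaply and safely is exactly what would need vetting upstream.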

springmeyer mentioned this issue Aug 28, 2012

Provide a way to auto-calc and cache postgis layer extents #1634

Closed

springmeyer mentioned this issue Jun 4, 2012

lazy parsing mode for csv plugin mapnik/mapnik#919

Open
