Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Centralise XYZ services? #153

Closed
darribas opened this issue May 26, 2020 · 20 comments
Closed

Centralise XYZ services? #153

darribas opened this issue May 26, 2020 · 20 comments

Comments

@darribas
Copy link
Collaborator

Playing with xarray-leaflet I realised ipyleaflet supports a similar set of providers than us and seems to hand-code the providers in a similar way to how we do it:

https://github.com/jupyter-widgets/ipyleaflet/blob/master/ipyleaflet/basemaps.py

This has got me thinking if there's scope to join forces into a small library (xyzervices?) that parses a large amount of XYZ providers in a way that is useful to the Python data community. From the looks of it, other libraries could also use it and we'd only need to edit providers in one place for everyone to benefit.

Would this make sense? @jorisvandenbossche, @SylvainCorlay, @davidbrochart

@davidbrochart
Copy link

@davidbrochart
Copy link

Indeed, thanks for pointing that out.
We use traitlets's Bunch, but we could have a no-dependency structure in this new library and transform it in ipyleaflet.

@martinRenou
Copy link

martinRenou commented May 26, 2020

I love the idea. Thanks for bringing that up.

We use traitlets's Bunch, but we could have a no-dependency structure in this new library and transform it in ipyleaflet.

What's really nice about the Bunch data structure is that, unlike a simple dictionary, it provides completion in IPython so people can quickly look at what's available inside the structure. But we can definitely re-implement this data structure in this new library, it's not too much code duplication (see https://github.com/ipython/traitlets/blob/ee1a829fafad654601dc851dc1d4ec79da090913/traitlets/utils/bunch.py).

@jorisvandenbossche
Copy link
Member

When I wrote the new providers module in contextily, it was actually inspired on ipyleaflet (I know I had the same thought about potentially sharing this as well, but forgot to open an issue about it back then, thanks Dani!).

So if you look at the providers code in contextily (https://github.com/geopandas/contextily/blob/master/contextily/_providers.py), you will see it also uses a "Bunch". Which is basically the same as the one of traitlets, but hardcoded (so indeed adding attribute access to the dictionary, and in addition I also added __call__ to pass custom values to the actual Providers):

class Bunch(dict):
"""A dict with attribute-access"""
def __getattr__(self, key):
try:
return self.__getitem__(key)
except KeyError:
raise AttributeError(key)
def __dir__(self):
return self.keys()
class TileProvider(Bunch):
"""
A dict with attribute-access and that
can be called to update keys
"""
def __call__(self, **kwargs):
new = TileProvider(self) # takes a copy preserving the class
new.update(kwargs)
return new

@martinRenou
Copy link

martinRenou commented May 26, 2020

Nice. It would definitely make sense to share this.

@darribas
Copy link
Collaborator Author

Nice! What would be a good approach? A few questions to think through:

  • My sense is ideally we'd set a separate mini package (xyzservices or something like that) which we can all "consume" from
  • If so, I haven't looked in detail, but what'd be the specs/API be required for a service?
  • Initially, we could start with the list of leaflet providers we have in ipyleaflet/contextily (I think right now it's the same)
  • Going forward, another benefit of a package like this would be to be able to take arbitrary contributions from other services, so we are not limited to what leaflet offers. As long as we had a clear spec of the format expected, it should be easy to provide a structure that allows to merge an automated list from leaflet and a manual one community contributed, and expose the two as one big list to the end-user
  • Also, where should this package live?

Does this sound sensible? Maybe let's flesh out more details on this issue so, when somebody has bandwith, it's easy to pick up coding?

@martinRenou
Copy link

If so, I haven't looked in detail, but what'd be the specs/API be required for a service?

I guess a good approach would be to look at what you need for each element of the bunch in geopandas, look at what we need in ipyleaflet, and merge the results.

What we need in ipyleaflet is:

  • url
  • name
  • attribution
  • min_zoom (optional)
  • max_zoom (optional)

Also some urls contain a % that we replace by the date (e.g. when the satellite images were taken) in our Python API: https://github.com/jupyter-widgets/ipyleaflet/blob/master/ipyleaflet/leaflet.py#L46

Initially, we could start with the list of leaflet providers we have in ipyleaflet/contextily (I think right now it's the same)

I feel like there is a bit more in contextily, but again merging both would be the best approach I guess.

@jorisvandenbossche
Copy link
Member

All those listed keys are available in the providers defined here in contextily (again, the dict structure is actually based on ipyleaflet ;), and all values are obtained from scraping leaflet-providers https://github.com/leaflet-extras/leaflet-providers).
Have a look at https://github.com/geopandas/contextily/blob/master/contextily/_providers.py to see if that looks OK. I think url/name/attribution should be present for each of the providers

One potential problem is the attribution: in contextily we stripped them from all html code (https://github.com/geopandas/contextily/blob/master/scripts/parse_leaflet_providers.py#L121), while keeping this probably makes sense for ipyleaflet.
Now, it should not be a problem to preserve that in the original ones, and then we can do the stripping in contextily (or add a attribution_text key with the stripped version that contextily can use).

For the date, this is done with a /{time}/ in the url (at least for the NASA example you gave), so the url % day in ipyleaflet could be replaced with url.format(time=day), I think (or actually updating the provider with provider = provider(time=day), if it has a time key).
Would that work for ipyleaflet?

So if ipyleaflet would happy to use this, I think it would basically entail:

  • move the contextily/_tile_providers.py + the script to generate it to a new package almost as is
  • add the few ones that ipyleaflet has and contextily not (from a quick look, I think this is only the Strava ones). For this, we need to implement a bit of code to merge the autogenerated ones with a set of manually added ones.
@davidbrochart
Copy link

davidbrochart commented May 30, 2020

For the date, this is done with a /{time}/ in the url (at least for the NASA example you gave), so the url % day in ipyleaflet could be replaced with url.format(time=day), I think (or actually updating the provider with provider = provider(time=day), if it has a time key).
Would that work for ipyleaflet?

Yes, in any case we can update our code it there is any change in how we insert the date.

So if ipyleaflet would happy to use this, I think it would basically entail:

  • move the contextily/_tile_providers.py + the script to generate it to a new package almost as is

  • add the few ones that ipyleaflet has and contextily not (from a quick look, I think this is only the Strava ones). For this, we need to implement a bit of code to merge the autogenerated ones with a set of manually added ones.

Let's get started with this.

@darribas
Copy link
Collaborator Author

This is great! I'm not sure when exactly I'll have bandwidth to get started on this (summer fast approaching?), but if anyone can, please do go ahead.

In the meantime, it might help discussing a few things (aka a few random thoughts about going forward):

  • Where'd we host this? Would the geopandas/jupyter/other orgs be a good home for the project? A separate one?
  • Besides the coding aspect, it'd be really cool to get a clear spec of how this library would present sources, so all the libraries involved can consume them directly and we propose a "standard" that future projects using XYZ sources could adopt
  • It'd also be nice to have a contribution structure that allowed us:
    1. To ingest automatically the Leaflet list (which I think all the projects here use currently)
    2. To manually add independent sources. For this, in the documentation, it'd be cool to lay out clearly how folks can contribute more sources as a PR and how we can "check they work" (?) in a streamlined process
    3. To automatically process all these sources and present them to the user with a single interface that is seamless and agnostic to how the source/provider was "ingested" in the library
  • And, of course, a catchy, witty name :-)
@davidbrochart
Copy link

  • Where'd we host this? Would the geopandas/jupyter/other orgs be a good home for the project? A separate one?

I think hosting it in geopandas makes more sense than in jupyter.

@davidbrochart
Copy link

It looks like we are all too busy to work on this. Should we drop the idea or keep it open? We'd like to move forward with jupyter-widgets/ipyleaflet#741.

@darribas
Copy link
Collaborator Author

Hello, thanks for the post and checking in. I'm very low on bandwith atm (term time in corona days does not leave much coding time...) and would not want to delay any progress on your end. If any of those ideas could be developed outside ipyleaflet so we can consume in other libs like contextily, it'd be fantastic. I can help on that front but leading on it is unlikely.

In reading #741 it occurred to me that a start to centralise basemaps could be a JSON file that could be imported on ipyleaflet/xleaflet and that we can also work with in contextily. Over time we can work on tooling to automate the generation of that file and how we can take further sources from places other than leaflet.

I'm sorry I can't commit to more right now, I'm just over-stretched.

@davidbrochart
Copy link

No worries @darribas!
So yes we could start by simply implementing jupyter-widgets/ipyleaflet#741 in a separate repository, and build up from there. Thanks!

@martinfleis
Copy link
Member

A quick update on this. @darribas and I plan to do a coding sprint on this by the end of the next month, starting from the current contextily._providers module and including features like #172 and #171.

One thing I'd like to understand before that is the best way of storing the basemap metadata. At the moment, we have them hard-coded in a (generated) Bunch object

providers = Bunch(
OpenStreetMap = Bunch(
Mapnik = TileProvider(
url = 'https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png',
max_zoom = 19,
attribution = '(C) OpenStreetMap contributors',
name = 'OpenStreetMap.Mapnik'
),

However, reading through jupyter-widgets/ipyleaflet#741 and the discussion above, one option which came to my mind is to store them as JSON and create the Bunch on initialization (should be lightweight script anyway). In that way, we still have the Python object with completion in Jupyter and generic JSON anyone can ingest and use, even without Python in their environment (i.e. linking it directly from repo).

Also, if there are some updates in ipyleaflet or elsewhere that should be taken into the account, let us know.

@davidbrochart
Copy link

store them as JSON and create the Bunch on initialization

That would be great.

@martinRenou
Copy link

One idea brought by @SylvainCorlay in jupyter-widgets/ipyleaflet#741 was that we could install this JSON file under /share

To answer to your question in this issue @jorisvandenbossche:

JSON data from a share/jupyter/leaflet/basemaps directory...

To understand better, what would be the exact idea of the above? Is this something that already exists from the leaflet installation, or is it something that ipyleaflet (or another package) would generate and install, so it is usable by ipyleaflet itself and other packages like xleaflet?

It's something that this xyzervices would install, so maybe we could install this JSON file under share/xyzervices? This way the JSON file is discoverable to other libraries, and it also allows users to update this JSON file or provide their own.

@martinfleis
Copy link
Member

I like this idea (not that I know how to do that atm). To use the JSON you just need to ensure that xyzservices is installed but no need to import it or fetch it from the web.

@martinfleis
Copy link
Member

The work on the package has begun at https://github.com/geopandas/xyzservices. Please express any wishes we did not cover yet (either in code or in issue) as an issue in the repo. At the moment, we store providers in JSON and mimic the behaviour of contextily.providers module.

@martinfleis
Copy link
Member

We're moving to the next phase with this. xyzservices has been released! https://xyzservices.readthedocs.io/en/latest/

During the following days, I'll try to open issues in respective packages to start a discussion on how to use it in the most efficient way. Thanks a lot for this discussion - now let's move it to https://github.com/geopandas/xyzservices. 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
5 participants