IRI and URI library with full RFC 3986 / 3987 / 6874 / 8141 compliance

handrews/iriuri

The iriuri Python Package

WARNING: This package is not yet ready for use!

The iriuri package will offer both generic and scheme-specific IRI and URI parsing, validation, manipulation, and serialization.

It will also be able to serve as an Apache 2.0-licensed drop-in replacement for at least the basic parsing/composing/resolving features of the GPL'd rfc3987 library (which I highly recommend if you are able to comply with the conditions of the GPL).

Release timeline

This package is being developed by @handrews. If you would like for me to have more time for open source work, I would be grateful for any sponsorship.

Features

Initial functionality will target generic IRI/URI syntax, with scheme-specific features for http, https, and urn. Media-type-based fragment syntax for HTML, XML, YAML, JSON Schema, and OpenAPI, along with application/x-www-form-urlencoded query strings, will be prioritized next.

This library's eventual goals include:

  • Round-trip-safe parsing and serialization
    • correct handling of empty string vs null components
    • preservation of case for case-insensitive components
    • no unexpected automatic encoding or normalization
  • Proper handling of base URIs/IRIs and URI/IRI-references
  • Full compliance with the ABNF from appropriate standards, including:
    • RFC 3986 Uniform Resource Identifier (URI): Generic Syntax
    • RFC 3987 Internationalized Resource Identifiers (IRIs)
    • RFC 6874 Representing IPv6 Zone Identifiers in Address Literals and URIs
    • RFC 8141 Uniform Resource Names (URNs)
    • additional scheme-/query-/fragment-/urn-namespace-specific standards TBD
  • Flexible options to balance performance and correctness
  • Extensible parsing, validation, normalization, defaults, and comparisons for specific URI/IRI subsets
    • Scheme-specific for authority/query syntax
    • Query format-specific for query syntax
    • Media type-specific for fragment syntax
    • URN namespace-specific for namespace-specific string syntax
  • A comprehensive test suite
    • Scheme-specific tests where relevant
    • Please file an issue if you know of public IRI test data!
    • CJK, right-to-left, vertical, and other display challenges
    • Although this is not a display library, there is a gap in good IRI test data
    • Likely to include WHATWG's URL tests, although this library will comply with IETF standards; whether WHATWG support will ever be considered depends on the degree of divergence involved
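
To make the round-trip goal concrete, here is a minimal sketch of what "empty string vs null components" means in practice. It is not the iriuri API, which does not exist yet. It uses only the generic regular expression from RFC 3986, Appendix B, and the names Parts, split_reference, and join_parts are hypothetical. It performs no validation, encoding, or normalization.

```python
import re
from typing import NamedTuple, Optional

# Generic component-splitting regular expression from RFC 3986, Appendix B.
# DOTALL is added here so that the fragment group consumes any character.
_URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?",
    re.DOTALL,
)


class Parts(NamedTuple):
    """Hypothetical container: None means 'absent', '' means 'present but empty'."""
    scheme: Optional[str]
    authority: Optional[str]
    path: str  # the path is always present, though it may be empty
    query: Optional[str]
    fragment: Optional[str]


def split_reference(text: str) -> Parts:
    """Split a URI-reference into components without validating or normalizing."""
    match = _URI_REFERENCE.match(text)
    # Every group in the pattern is optional, so the match always succeeds.
    assert match is not None
    return Parts(
        scheme=match.group(2),
        authority=match.group(4),
        path=match.group(5),
        query=match.group(7),
        fragment=match.group(9),
    )


def join_parts(parts: Parts) -> str:
    """Recompose components, emitting a delimiter only when the component is present."""
    out = []
    if parts.scheme is not None:
        out.append(parts.scheme + ":")
    if parts.authority is not None:
        out.append("//" + parts.authority)
    out.append(parts.path)
    if parts.query is not None:
        out.append("?" + parts.query)
    if parts.fragment is not None:
        out.append("#" + parts.fragment)
    return "".join(out)


# An empty query ("?" with nothing after it) is distinct from an absent query,
# and the host's case is preserved rather than lowercased.
with_empty_query = split_reference("http://Example.COM/?")
without_query = split_reference("http://Example.COM/")
```

In this sketch, with_empty_query.query is the empty string while without_query.query is None, so join_parts can reproduce each input exactly. A library that stores both cases as an empty string cannot tell them apart when serializing.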
