Pull requests: openzim/warc2zim

Wabac fuzzy rules - update + process
#335 by benoit74 was merged Jun 27, 2024
Add option to specify content header length
#321 by benoit74 was merged Jun 18, 2024
Add option to ignore charsets found automatically
#319 by benoit74 was merged Jun 18, 2024
Fix detection of encoding again
#314 by benoit74 was merged Jun 17, 2024
Upgrade deps + fix rewrite mode
#310 by benoit74 was merged Jun 13, 2024 (milestone 2.0.1)
Detect content type based on WARC-Resource-Type
#306 by benoit74 was merged Jun 13, 2024
Set correct charset in HTML documents
#303 by benoit74 was merged Jun 11, 2024
Use same automatic encoding detection for all contents [enhancement]
#302 by benoit74 was merged Jun 11, 2024
Fix handling of <script> type in HTML documents
#292 by benoit74 was merged Jun 3, 2024
Fix fuzzy rule for Youtube thumbnails in JS
#285 by benoit74 was merged May 30, 2024 (milestone 2.0.0)
Properly detect nested redirection loops
#282 by benoit74 was merged May 30, 2024
Merge warc2zim2 branch into main
#280 by benoit74 was merged May 24, 2024
Do not rewrite URLs composed of just a fragment
#279 by benoit74 was merged May 24, 2024
Avoid and detect redirection loops
#278 by benoit74 was merged May 24, 2024
Fix youtube thumbnails
#274 by benoit74 was merged May 24, 2024
Add support for onxxx HTML events
#270 by benoit74 was merged May 24, 2024
Document scraper capabilities and limitations
#269 by benoit74 was merged May 24, 2024