Update crawl_id.py
Adding some sources.
mahalisyarifuddin committed Nov 9, 2019
1 parent 4c6cb38 commit 8ccc08b
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion Lib/corpuscrawler/crawl_id.py
@@ -19,5 +19,11 @@

def crawl(crawler):
    out = crawler.get_output(language='id')
    crawl_udhr(crawler, out, filename='udhr_ind.txt')
    crawler.crawl_abc_net_au(out, program_id='indonesian')
    crawler.crawl_voice_of_america(out, host='voaindonesia.com')
    crawl_bbc_news(crawler, out, urlprefix='/indonesia/')
    crawl_deutsche_welle(crawler, out, prefix='/id/')
    crawl_udhr(crawler, out, filename='udhr_ind.txt')
    crawl_bibleis(crawler, out, bible='INDASV')
    crawl_bibleis(crawler, out, bible='INDWBT')
    crawl_bibleis(crawler, out, bible='INDSHV')
