This is the project I used to demo fuzzy searching and web scraping at the HSP•PESUECC Project Expo ❤️
Hey! This is a CLI tool that uses APIs to retrieve user-requested xkcd comics. It's a relatively small project, and it's still a WIP cuz of a lack of data. This project is something of a playground for me to explore different searching and querying techniques.
Because of the data limitation, my goal is to make it super easy to find a specific comic from a query. The current roadmap is to build a smart CLI tool that finds the most relevant comic for a search query.
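This README doesn't spell out which API the tool calls, so here is a minimal sketch that assumes xkcd's public JSON endpoints (`https://xkcd.com/info.0.json` for the latest comic, `https://xkcd.com/<number>/info.0.json` for a specific one). The function names `comic_api_url` and `parse_comic` are made up for illustration and are not taken from the repo.

```python
import json
from typing import Optional

XKCD_BASE = "https://xkcd.com"


def comic_api_url(number: Optional[int] = None) -> str:
    """Return the JSON endpoint for one comic, or for the latest comic if number is None."""
    if number is None:
        return f"{XKCD_BASE}/info.0.json"
    if number < 1:
        raise ValueError("comic numbers start at 1")
    return f"{XKCD_BASE}/{number}/info.0.json"


def parse_comic(payload: str) -> dict:
    """Pick the fields a CLI would typically show from the raw JSON text."""
    data = json.loads(payload)
    return {
        "num": data["num"],
        "title": data["title"],
        "alt": data.get("alt", ""),  # hover text; treated as optional here
        "img": data["img"],
    }

# Fetching is left out of the functions so they stay easy to test, e.g.:
#   import urllib.request
#   raw = urllib.request.urlopen(comic_api_url(297)).read().decode("utf-8")
#   print(parse_comic(raw)["title"])
```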
- Clone this repository
git clone https://github.com/bwaklog/xkcd-grab
- Install requirements. Some prerequisites:
  - Python 3+
  - Tesseract OCR engine, which is going to be used in future updates
./install.sh
- Add the xkcd alias to your shell for easier commands. Add the alias manually for now; I still have to figure out how to automate this.
alias xkcd='./xkcd.sh'
Sidenote: the script creates a virtual env, venv, so you might want to start using it:
source venv/bin/activate
PS: even if you mess up the commands, there is a help file to guide you... which I am yet to complete :P
Here is the template that the CLI commands must follow:
xkcd <type of request> <extra commands>
type of request | flags |
---|---|
latest comic | -l, --latest |
specific comic number | integer, ex: 297 |
Fuzzy search over comic titles | -f, --fuzzy |
Regular search, somewhat like SQL syntax | -s, --search |
Web-scraping search that uses Google's search results to find the best match | -g, --google |
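To give a feel for what fuzzy searching titles means, here is a small sketch using `difflib` from the Python standard library. It is only an illustration and may not be how this tool does it; the `fuzzy_titles` helper and the three sample titles are assumptions, and duplicate titles are ignored for simplicity.

```python
import difflib


def fuzzy_titles(query, titles, limit=3, cutoff=0.5):
    """Return up to `limit` (number, title) pairs whose title is closest to `query`.

    `titles` maps comic number -> title. Matching is case-insensitive.
    """
    by_lower = {title.lower(): num for num, title in titles.items()}
    matches = difflib.get_close_matches(
        query.lower(), list(by_lower), n=limit, cutoff=cutoff
    )
    return [(by_lower[m], titles[by_lower[m]]) for m in matches]


# Sample data only; a real run would load every comic title.
SAMPLE_TITLES = {297: "Lisp Cycles", 327: "Exploits of a Mom", 353: "Python"}

if __name__ == "__main__":
    # A typo in the query should still land on the intended title.
    print(fuzzy_titles("pyhton", SAMPLE_TITLES))
```

Raising `cutoff` makes the search stricter, lowering it lets sloppier queries through at the cost of more noise.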
type of request | flags |
---|---|
quick look comic: uses the system Quick Look on macOS and the system default app on other platforms to display comics | -q, --ql |
Saving Comics Feature (currently can be done by saving the image opened by Quick Look) | TBA |
Sharing Comics Feature (currently can be done by saving the image opened by Quick Look) | TBA |
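The flags in the two tables above could be wired up roughly like this with `argparse`. This is a hypothetical sketch, since the actual tool is launched through `xkcd.sh` and its real argument handling isn't shown here; `build_parser` is an invented name and only the flags listed in the tables are included.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="xkcd", description="Grab xkcd comics from the terminal"
    )
    # A bare integer asks for one specific comic, e.g. `xkcd 297`.
    parser.add_argument("number", nargs="?", type=int, help="specific comic number")

    # Only one search mode makes sense per invocation.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("-l", "--latest", action="store_true", help="latest comic")
    mode.add_argument("-f", "--fuzzy", action="store_true", help="fuzzy search titles")
    mode.add_argument("-s", "--search", action="store_true", help="regular search")
    mode.add_argument("-g", "--google", action="store_true", help="web scraping search")

    # Extra command: how to display the result.
    parser.add_argument("-q", "--ql", action="store_true", help="quick look the comic")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

The mutually exclusive group means something like `xkcd -l -f` is rejected with a usage message instead of silently picking one mode.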
- On macOS, the image is opened using the system Quick Look. This is done with the qlmanage command.
- Web scraping uses Google's best matches to find the comic you are searching for. All you need to type is a search query (anything that describes the comic).
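Here is a rough sketch of the Quick Look idea: build a `qlmanage -p` command on macOS and fall back to the default app elsewhere. The fallbacks (`xdg-open` on Linux, `start` on Windows) are my assumptions about what "system default app" could mean, and `viewer_command` / `show_comic` are illustrative names, not the repo's.

```python
import platform
import subprocess
from typing import List, Optional


def viewer_command(image_path: str, system: Optional[str] = None) -> List[str]:
    """Build the command that displays `image_path` on the given platform."""
    system = system or platform.system()
    if system == "Darwin":
        # `qlmanage -p` opens a Quick Look preview of the file.
        return ["qlmanage", "-p", image_path]
    if system == "Windows":
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", image_path]
    # Assumption: a Linux desktop with xdg-open available.
    return ["xdg-open", image_path]


def show_comic(image_path: str) -> None:
    """Open the comic image and keep the viewer's log output out of the terminal."""
    subprocess.run(
        viewer_command(image_path),
        check=False,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

Keeping the command builder separate from the `subprocess.run` call makes it possible to check the per-platform behaviour without actually opening anything.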
<iframe src="https://i.imgur.com/xCOmCyX.mp4" allow="fullscreen" allowfullscreen="" style="height: 100%; width: 100%; aspect-ratio: 16 / 9;"></iframe>
This project is something of a playground for me to explore different searching algorithms and querying techniques. While this might have a niche target, I want to build this tool into a more robust API client. The current roadmap is to build a smart CLI tool that finds the most relevant comic for a search query.
The web scraping function currently built into the app gives the kind of results I am trying to achieve using only the data from all the 2800+ comics. So this is still very much a work in progress.
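One possible direction for searching with the comic data alone is to score each comic by how many query words appear in its text, giving rarer words more weight. The sketch below is just that idea in a few lines, not the approach this project has settled on; `rank_comics`, `tokenize` and the sample texts are all made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_comics(query, comics, limit=3):
    """Rank comics by rarity-weighted overlap between query words and comic text.

    `comics` maps comic number -> any text known about it (title, alt text, ...).
    """
    docs = {num: set(tokenize(text)) for num, text in comics.items()}
    # Document frequency: in how many comics each word appears.
    df = Counter(word for words in docs.values() for word in words)
    total = len(docs)
    query_words = set(tokenize(query))

    scores = {}
    for num, words in docs.items():
        # Words that show up in few comics count for more than common ones.
        score = sum(math.log(1 + total / df[w]) for w in query_words & words)
        if score > 0:
            scores[num] = score
    return sorted(scores, key=scores.get, reverse=True)[:limit]


if __name__ == "__main__":
    sample = {  # invented snippets, not real transcripts
        1: "python snake",
        2: "python antigravity flying",
        3: "sudo make me a sandwich",
    }
    print(rank_comics("flying with python", sample))
```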
- Create a web interface using Flask... May or may not go ahead with this option cuz the main goal was to create a CLI tool. But if needed, I'll take a chance at making one.
- 💾 Local storage options for comics
- ❤️ Creating bookmark/liking features
- 📩 Creating a sharing option. Send your favorite comics to your friends with a few clicks!
- Umm... a neat interface, cuz I don't want to find myself using tkinter or some other boring-looking tool.
- There was supposed to be an install.sh script to add the xkcd.sh script to your aliases, but that didn't seem to work cuz I don't know how to do that yet
Feature | Progress |
---|---|
Smart Comic Search | In progress |
💾 Local Storage Option | workaround available |
❤️ Liking/Bookmarking option that saves the comic number instead of storing the comic locally | |
📩 Sharing feature (undecided) | WAP |
Flask generated page | TBA |
⚠️ This is very much in development, but here is how you can use the little orca-mini LLM to make the CLI explain the comic:
- Install ollama
- Install the orca-mini LLM using ollama (about 2.0 GB)
ollama pull orca-mini
# if you're familiar with docker, you know what's going on
# also macOS and Linux only for now, I guess (26th Nov)
- Start the server in another terminal window
ollama serve
- Use the -e or --explain flag after a fuzzy search or web scraping, and it will start generating once the results are in
xkcd -f -e
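For the curious, this is roughly what talking to the local ollama server looks like from Python. It is a sketch under assumptions: it posts to ollama's `/api/generate` endpoint on the default port 11434 with streaming turned off, while the prompt wording and the `build_explain_request` / `explain_comic` helpers are invented here and may differ from what the tool actually does.

```python
import json
import urllib.request

# Default address used by `ollama serve`.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_explain_request(title, alt_text, model="orca-mini"):
    """Build (but don't send) a request asking the model to explain a comic."""
    prompt = (
        f"Explain the joke in the xkcd comic titled '{title}'. "
        f"Its alt text is: {alt_text}"
    )
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def explain_comic(title, alt_text):
    """Send the request; needs `ollama serve` running and orca-mini pulled."""
    with urllib.request.urlopen(build_explain_request(title, alt_text)) as resp:
        # With stream=False the server answers with a single JSON object.
        return json.loads(resp.read())["response"]
```

If the server isn't running, `explain_comic` raises a `urllib.error.URLError`, which is a handy hint to go start `ollama serve` in the other terminal.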
What I'm using for this program:
- This isn't really a disclaimer, but if you don't have Quick Look (macOS only), that's no problem! For now, all you get is:
  - A link to the image of the post, which you can open in your default browser
  - Very, very, very descriptive info about the post you requested
- Yeah, I haven't used this on a Windows PC so far, and some of these... most of these commands are UNIX commands, so join the Force with BASH
- I'm running this in a venv for development, so make sure you install all the requirements from requirements.txt
- That's it for now... nothing else to force you to install, other than python3
- ollama and orca-mini for the party feature