Keyword Analysis #70

Open
mpbunch opened this issue Apr 14, 2021 · 2 comments

mpbunch commented Apr 14, 2021

Describe the bug
Words in the keyword analysis list seem to be missing trailing characters.
It doesn't always happen, but it happens often enough that I noticed it.

To Reproduce
Steps to reproduce the behavior:

  1. Run the Python/CLI code
  2. Navigate to the keyword section of the HTML output
  3. Observe
@sethblack sethblack self-assigned this Apr 17, 2021
@sethblack
Owner

Heyo! Correct, this is an artifact of stemming and lemmatization of the keywords. I added a "dumb" stemmer lookup that keeps a dictionary mapping each stemmed form back to the first occurrence of a word that produced it. After the analysis is complete, that first-occurrence word is the one you'll see in the keyword report. For example, you'll see the human-readable word "glasses" instead of some horrible internal mix of letters the stemmer came up with.
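
A minimal sketch of the lookup idea described above, for illustration only. It is not the project's actual code: `toy_stem`, `count_keywords`, and the suffix rules are hypothetical stand-ins, and a real analyzer would presumably call a proper stemmer or lemmatizer instead.

```python
from collections import Counter


def toy_stem(word: str) -> str:
    """Hypothetical stand-in for a real stemmer: strips a few common suffixes."""
    if word.endswith("ss"):
        return word  # leave words like "glass" alone
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def count_keywords(words):
    """Count words by stem, but report each stem as its first-seen original word."""
    counts = Counter()
    first_seen = {}  # stem -> first original word that produced it
    for word in words:
        stem = toy_stem(word.lower())
        counts[stem] += 1
        first_seen.setdefault(stem, word)  # only the first occurrence is kept
    # Swap the internal stems for readable words before reporting.
    return {first_seen[stem]: n for stem, n in counts.items()}


report = count_keywords(["glasses", "glass", "analyzing", "analyzes", "glasses"])
print(report)
```

Without the `first_seen` mapping, the report keys would be the raw stems (here `glass` and `analyz`), which is the kind of truncated word described in the bug report.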

I should definitely add some documentation explaining this so people aren't caught off-guard.

@Getseowebsite

How do I navigate to the keyword section of the HTML output?
