Vocab for Readers

Python / Django

What is it?

I come across new words often — reading the news, books, or even doing crossword puzzles — and am often frustrated to find that I forget what they mean by the time I see them next. I created Vocab for Readers as a way to build my vocabulary with the words I discover organically.

Each time you add a word, you’ll also include the sentence in which the word was found and the source where you found it (e.g. "An article in the NYT about monarch butterflies"). With the Shortcut I built for iOS or Mac, it’s as simple as copying the sentence and following a few prompts. The only information entered manually is the name of the source, so it takes about 15 seconds to add the word with the context and source. See below for details on how the Shortcut works.

Once you’ve added the word, the app instantly retrieves the definition, synonyms, examples, and etymology. It first queries Webster’s Dictionary API; if any of those fields is missing there, it queries WordsAPI to fill in the gaps. Between the two sources, the vast majority of words end up with some data for each field.

After you have some words in your list, you can begin quizzing yourself. I made the quizzes as simple as possible to minimize friction — for this to work, users need to actually quiz themselves. Each quiz consists of 10-15 words. See below for a more detailed explanation of the logic around building the quiz queue, which takes into account the user’s previous responses for each word, or whether it’s a newly added word.

How it works

Data Model

There are three main tables: Word, VocabEntry, and QuizResponse (plus User, which is less visible because I’m using Django’s default user settings). When a user adds a word, the dictionary data is stored in the Word table if it isn’t there already. A VocabEntry is also created, linked to the Word and to the User who added it. The VocabEntry holds the discovery context and discovery source, since those are unique to the user. The user may opt to override the definition, synonyms, etc. with their own; those overrides are also stored in the VocabEntry (the override fields are empty by default). When a user takes a quiz, a QuizResponse is stored for each VocabEntry in the quiz, with correct_answer set to True or False. The QuizResponse is also linked to the User. I’ve simplified the User table in the diagram below for illustrative purposes.

https://www.notion.so/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F78cf2b16-8db5-48b7-a774-101f060e9da2%2F06888f59-8a8b-428b-acc0-0fdc74ed0541%2Fvocab_for_readers_db_diagram.png?id=15cf7389-bc8e-4c9d-9ad6-9cf33423641f&table=block
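
As a rough, framework-free illustration of the relationships described above, here are the three tables as plain Python dataclasses. This is not the app's actual Django model code. Apart from correct_answer, the field names (for example definition_override) and the display_definition helper are assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Word:
    # Shared dictionary data, stored once per word (field names are assumed).
    lemma: str
    definition: str = ""
    synonyms: str = ""
    examples: str = ""
    etymology: str = ""


@dataclass
class VocabEntry:
    # One user's entry for a Word, plus the context that is unique to them.
    user_id: int
    word: Word
    discovery_source: str = ""
    discovery_context: str = ""
    # Override fields are empty by default.
    definition_override: str = ""
    synonyms_override: str = ""

    def display_definition(self) -> str:
        # Hypothetical helper: prefer the user's own definition when present.
        return self.definition_override or self.word.definition


@dataclass
class QuizResponse:
    # One row per VocabEntry per quiz, also linked to the user.
    user_id: int
    vocab_entry: VocabEntry
    correct_answer: bool
    created_at: datetime = field(default_factory=datetime.now)


word = Word(lemma="read", definition="to look at and understand written words")
entry = VocabEntry(user_id=1, word=word, discovery_source="NYT article")
response = QuizResponse(user_id=1, vocab_entry=entry, correct_answer=True)
print(entry.display_definition())
```

Keeping the overrides on VocabEntry rather than on Word means one user's edits never change what another user sees for the same word.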

The Shortcut

I do most of my reading on my phone (iPhone) or computer (Mac). Using Apple Shortcuts was a simple way for me to add words efficiently as I read.

iPhone

On an iPhone, you can install the shortcut and then add it to your home screen in just a few taps:

When you open the shortcut link, you'll see the "Download" option at the bottom of your screen. Tap it.
Once it's downloaded, you'll see "Open in..." at the bottom of your screen. Tap it and then select Shortcuts.
Tap the three-dot menu for the shortcut within the Shortcuts app.
Tap the Share icon at the bottom.
Tap “Add to Home Screen”.

To add a word, highlight the full sentence that the word is in, copy it, and then go to your home screen and tap the Vocab for Readers shortcut icon. That opens a dialog where you can type the source (e.g. the article or book where you found the word) and select whether you copied the word in context, just the word, or nothing at all. The shortcut then opens your browser to the Add a Word page with fields pre-filled.
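
One plausible way to get a pre-filled form is to pass the values as URL query parameters. The sketch below assumes that approach; the base URL, the parameter names, and the mode labels are all hypothetical, since the text above does not say how the Shortcut hands data to the page.

```python
from urllib.parse import urlencode

# Hypothetical base URL and parameter names, used only for illustration.
ADD_WORD_URL = "https://example.com/add-word/"


def build_prefill_url(source: str, clipboard: str, mode: str) -> str:
    """Return the Add a Word URL with query parameters for the chosen mode.

    mode is one of "context", "word", or "nothing".
    """
    params = {"discovery_source": source}
    if mode == "context":
        params["discovery_context"] = clipboard
    elif mode == "word":
        params["word"] = clipboard
    elif mode != "nothing":
        raise ValueError(f"unknown mode: {mode}")
    # urlencode escapes spaces and punctuation so the sentence survives the trip.
    return ADD_WORD_URL + "?" + urlencode(params)


print(build_prefill_url("NYT article", "ephemeral", "word"))
# https://example.com/add-word/?discovery_source=NYT+article&word=ephemeral
```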

If you copied the word in context, the Discovery Source and Discovery Context fields will be pre-filled. You’ll then be able to select the word from a drop-down menu (an alphabetized list of all the words in Discovery Context). This is what I do most of the time — I think having the full discovery context is helpful for retention.

If you copied just the word itself, that will be pre-filled in the Word field, and Discovery Source will also be pre-filled.

If you didn’t copy anything (if you’re reading a physical book, for example), only the Discovery Source will be pre-filled, and you can enter the Discovery Context and Word manually.
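
For the copied-in-context case, the drop-down of candidate words could be built along these lines. This is a minimal sketch, assuming words are lowercased and de-duplicated before sorting; the function name word_choices and the tokenizing rule are illustrative choices, not the app's code.

```python
import re


def word_choices(discovery_context: str) -> list[str]:
    """Return the unique words in a sentence, lowercased and alphabetized."""
    # Keep runs of letters, allowing internal apostrophes and hyphens
    # so that "monarch's" and "well-timed" stay in one piece.
    words = re.findall(r"[A-Za-z]+(?:['’-][A-Za-z]+)*", discovery_context)
    return sorted({w.lower() for w in words})


print(word_choices("The monarch's migration is an ephemeral, well-timed spectacle."))
# ['an', 'ephemeral', 'is', 'migration', "monarch's", 'spectacle', 'the', 'well-timed']
```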

Then submit the form to add the word to your vocab list. If you want to edit the definition, synonyms, etc., click the “Update Word Data” link.

Mac

On a Mac, download the shortcut and then open it — that will open the Mac Shortcuts app. Click "Add Shortcut" in the Shortcuts app and you’re all set. Adding a word is the same as on iPhone, except instead of tapping the icon on the home screen, you’ll click the Shortcuts icon on your menu bar at the top of your screen and then select Vocab for Readers.

Word Data Retrieval

When a user adds a word, the app calls add_vocabentry.py. If the word doesn’t already exist in the Word table, it retrieves the definition data by calling build_dictionary_data.py. This script first calls the Webster’s Dictionary API to get the root word, or lemma (e.g. if the user enters “reading” the lemma would be “read”). That is what will be stored in the Word table (and any VocabEntry linked to the Word).
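
The lookup-then-store flow can be sketched with an in-memory dictionary standing in for the Word table. The names add_word, word_table, and fake_lookup are hypothetical, and the stub lookup replaces the real API call made by build_dictionary_data.py.

```python
from typing import Callable

# In-memory stand-in for the Word table, keyed by lemma.
word_table: dict[str, dict] = {}


def add_word(entered: str, lookup: Callable[[str], dict]) -> dict:
    """Return the stored Word for what the user typed, fetching only if needed.

    `lookup` is a hypothetical callable returning dictionary data that
    includes a "lemma" key.
    """
    key = entered.strip().lower()
    if key in word_table:
        return word_table[key]  # already stored, so no API call is made
    data = lookup(key)
    # Store under the lemma so that "reading" and "read" share one row.
    return word_table.setdefault(data["lemma"], data)


calls = []


def fake_lookup(word: str) -> dict:
    # Stub that records each call and maps "reading" to its lemma "read".
    calls.append(word)
    lemma = "read" if word == "reading" else word
    return {"lemma": lemma, "definition": "(stub definition)"}


first = add_word("Reading", fake_lookup)
second = add_word("read", fake_lookup)
print(first is second, calls)
```

In this sketch the second call finds "read" already stored, so the lookup runs only once.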

It then parses the Webster API’s highly complex JSON response. The bulk of the script is a function called extract_webster_data(). Here, I find every possible piece of data that might be useful for the user. The unpredictability of the JSON response for any given word meant that I needed to account for several possible structures. For example, some senses of a word might be structured as a “parenthesized sense sequence” (“pseq”) that contains a sequence of definition arrays within the main definition array. Some might have a “divided sense” (“sdsense”), which contains a “sense divider string” like “specifically” along with another portion of the definition. Some might have a “called also note” or “usage note”. All of these parts of a word’s data, and more, can be nested within other arrays.
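
To give a feel for the nesting, here is a small recursive walker over a hand-written, heavily simplified sense sequence. It is not extract_webster_data(): the sample data is invented, the function names are hypothetical, and the key names ("sn", "dt", "sd") reflect my understanding of the API's format and should be checked against its documentation.

```python
# Hand-written sample: one plain sense, then a "pseq" holding two senses,
# the second of which carries a divided sense ("sdsense").
sample_sseq = [
    [
        ["sense", {"sn": "1", "dt": [["text", "to look at written words"]]}],
        ["pseq", [
            ["sense", {"sn": "(1)", "dt": [["text", "first nested sense"]]}],
            ["sense", {
                "sn": "(2)",
                "dt": [["text", "second nested sense"]],
                "sdsense": {"sd": "specifically",
                            "dt": [["text", "a narrower meaning"]]},
            }],
        ]],
    ]
]


def dt_text(dt: list) -> str:
    """Join the plain "text" parts of a defining-text array."""
    return " ".join(body for kind, body in dt if kind == "text")


def collect_senses(items: list, out: list) -> None:
    """Append (label, text) pairs, recursing into nested sequences."""
    for kind, payload in items:
        if kind == "sense":
            out.append((payload.get("sn", ""), dt_text(payload.get("dt", []))))
            divided = payload.get("sdsense")
            if divided:
                out.append((divided.get("sd", ""), dt_text(divided.get("dt", []))))
        elif kind == "pseq":
            collect_senses(payload, out)
        # Other item kinds are ignored in this sketch.


def extract_definitions(sseq: list) -> list:
    out = []
    for group in sseq:
        collect_senses(group, out)
    return out


for label, text in extract_definitions(sample_sseq):
    print(label, text)
```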

I also needed to handle (and remove) tags that instruct how a word should be displayed, and number the “senses” and “subsenses” of the word in a way that allows the user to easily see which synonyms and examples correspond to each definition of the word.
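
Removing display tags might look roughly like the following. This is a sketch under assumptions: the token names handled here ({bc}, {it}, {sx|...}, {a_link|...} and similar) are the ones I believe the API uses, and both function names are hypothetical. The real script handles more cases.

```python
import re


def strip_display_tokens(text: str) -> str:
    """Remove display markup from a definition string, keeping linked words."""
    # {bc} marks a bold colon in front of a definition; drop it.
    text = text.replace("{bc}", "")
    # Link-style tokens such as {sx|word||} or {a_link|word}: keep the first field.
    text = re.sub(r"\{(?:sx|a_link|d_link|i_link|et_link)\|([^|}]*)[^}]*\}", r"\1", text)
    # Remaining simple markers such as {it} and {/it}: remove the markers only.
    text = re.sub(r"\{/?[a-z_]+\}", "", text)
    # Collapse any doubled whitespace left behind.
    return " ".join(text.split())


def label_senses(senses: list) -> list:
    """Prefix each cleaned definition with its sense number."""
    return [f"{sn} {strip_display_tokens(text)}".strip() for sn, text in senses]


print(strip_display_tokens("{bc}to {it}peruse{/it} {sx|study||} closely"))
# to peruse study closely
print(label_senses([("1", "{bc}to look at"), ("2 a", "{bc}to {it}study{/it}")]))
# ['1 to look at', '2 a to study']
```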

Handling this data structure in a way that ensures an organized, readable format for the user was one of the most challenging (and ultimately, satisfying) parts of this project.

After the Webster data is retrieved, if any field is still empty (or if the word was not found at all), build_dictionary_data.py then calls the WordsAPI to fill in the gaps.
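
The gap-filling step amounts to a field-by-field merge in which the first source wins. A minimal sketch, assuming each source has already been reduced to a flat dict; fill_gaps and the dict shapes are illustrative, not the script's actual interface.

```python
FIELDS = ("definition", "synonyms", "examples", "etymology")


def fill_gaps(primary: dict, fallback: dict) -> dict:
    """Prefer the primary source per field; use the fallback where it is empty."""
    merged = {}
    for name in FIELDS:
        value = primary.get(name)
        # Empty strings, empty lists, and None all count as missing.
        merged[name] = value if value else fallback.get(name)
    return merged


webster = {"definition": "lasting a very short time", "synonyms": [], "etymology": "Greek"}
words_api = {"definition": "short-lived", "synonyms": ["fleeting"], "examples": ["an ephemeral joy"]}
print(fill_gaps(webster, words_api))
```

A word missing from the first source entirely is just the case where primary is an empty dict, so no separate branch is needed.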

Once the word data is set, add_vocabentry.py creates a VocabEntry for the user, linked to the Word. The user is then sent to that VocabEntry’s page to see the data that was retrieved (along with their Discovery Source and Discovery Context), and they can override the word data if they’d like.

Quizzes

If you want to try out the quiz functionality quickly (without having to think of words to add to your vocab list), go to the Upload a Vocab List page and download the template. Then, upload the template on the same page. Once that’s uploaded, those words will appear in your vocab list and quiz.

Quizzing yourself is critical if you actually want to learn the words you add to your vocab list.

Each quiz contains 10-15 words, structured like flashcards. For each word, the user thinks of the definition and usage, and then clicks “Show Word Details.” That expands the word data so the user can see it and decide whether they knew it. Then they mark it Correct or Incorrect and continue to the next word. When they submit the quiz, a QuizResponse is created for each word, linked to the User and VocabEntry.

The words in any given quiz are set in create_quiz_queue.py according to the following logic:

For all previously quizzed words, the script calculates the user’s current response streak (e.g. -3 = 3 Incorrect).

Words with negative streaks are prioritized, sorted by worst streaks and then by most recently quizzed.

Then, words with low positive streaks (1 Correct or 2 Correct) are added, also sorted by worst streak followed by most recently quizzed. This way, you’re being quizzed on a word again for the next two quizzes after you’ve recently broken a negative streak — this helps with retention.

Finally, words with high positive streaks (3 or more Correct) are added, sorted only by least recently quizzed (not by streak). This keeps well-learned words from appearing in quizzes too often: once you’ve gotten a word correct three times in a row, other words with high positive streaks that haven’t been quizzed as recently are prioritized.

Previously quizzed words are capped at 10 per quiz, so if you don’t have any newly added words that haven’t been quizzed yet, your quiz ends up with 10 words (or fewer, if you have fewer than 10 quizzed words). If you have any un-quizzed words, up to 5 are added to the queue.
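
The ordering rules above can be sketched as follows. This is not create_quiz_queue.py: the History class, the integer last_quizzed field, and the function names are assumptions. Two further choices are mine: new words are appended after the quizzed ones, and the first five in input order are taken.

```python
from dataclasses import dataclass


@dataclass
class History:
    word: str
    results: list        # booleans, oldest first; empty means never quizzed
    last_quizzed: int    # larger means more recent (e.g. a timestamp)


def current_streak(results: list) -> int:
    """Length of the trailing run of equal answers, negative if Incorrect."""
    if not results:
        return 0
    last = results[-1]
    run = 0
    for answer in reversed(results):
        if answer != last:
            break
        run += 1
    return run if last else -run


def build_queue(histories: list, max_quizzed: int = 10, max_new: int = 5) -> list:
    quizzed = [(current_streak(h.results), h) for h in histories if h.results]
    new = [h for h in histories if not h.results]

    def worst_then_recent(pair):
        # Lower streak first, then most recently quizzed first.
        return (pair[0], -pair[1].last_quizzed)

    negative = sorted((p for p in quizzed if p[0] < 0), key=worst_then_recent)
    low = sorted((p for p in quizzed if 1 <= p[0] <= 2), key=worst_then_recent)
    # High streaks ignore the streak value: least recently quizzed first.
    high = sorted((p for p in quizzed if p[0] >= 3), key=lambda p: p[1].last_quizzed)

    ordered = [h.word for _, h in negative + low + high][:max_quizzed]
    return ordered + [h.word for h in new[:max_new]]


histories = [
    History("a", [True, False, False], 5),   # streak -2
    History("b", [False], 9),                # streak -1
    History("c", [False, True], 7),          # streak +1
    History("d", [True, True, True], 3),     # streak +3
    History("e", [True] * 4, 1),             # streak +4, quizzed longest ago
    History("f", [], 0),                     # never quizzed
]
print(build_queue(histories))
# ['a', 'b', 'c', 'e', 'd', 'f']
```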

On the homepage, I added instructions for creating an iPhone Automation that sends a push notification at a particular time every day linking to the Quiz page. Since I know the quiz will only take a minute or two, I almost always quiz myself when I get the alert on my phone.

The App

Years ago, I built a version of this app using Amazon RDS to host the database, with the word data retrieval scripts in Lambda, triggered by a Shortcut that called an API that I set up in Amazon API Gateway. Becoming familiar with those AWS services was great, but the quizzes were purely local, run in Terminal. It worked well enough for me, but I wanted to make it usable for anyone. When I started learning Django, I decided to return to this project and make it a full-fledged web app.

I decided to use PythonAnywhere to host the app and MySQL database. I set up my computer with a local test database so that I can easily test in the browser before deploying. This was my first “real” Django app, and as a self-taught developer, it was incredibly fun to learn as I built it.