Description as a Tweet:

First-of-its-kind intelligent thesaurus that uses AI (GPT-3) to generate suitable synonyms for words and phrases based on the context. In contrast to traditional thesauruses, Lexicon.ai ensures the alternatives always make sense to save you time when improving your writing.

Inspiration:

Since the start of this school year, I’ve noticed that several of my high school peers, myself included, struggle to find synonyms or alternative phrases that actually make sense in a specific context. The issue arises whenever you’re writing, say, an essay for English or a research paper for History, and you want to make your writing sound more sophisticated. Traditionally, you look up the word or phrase you want to change in an (online) thesaurus, only to discover that you either don’t know the majority of the suggested words, or that they make no sense at all in the context of your sentence. Not only this, but traditional thesauruses can be especially misleading for non-native English speakers trying to learn or use new words: they mark words like "find" and "locate" as practically identical, even though the two don’t work in the same contexts. Therefore, I built Lexicon.ai, a 21st-century intelligent thesaurus that solves this issue by considering your sentence as a whole and suggesting alternatives that actually work and make sense, revolutionizing today’s standard of thesauruses and word-finders.

What it does:

Once you reach the site, you’ll see a text input field where you can enter a sentence or portion of a sentence, wrapping the word or phrase you’re looking to change in quotation marks (e.g. Her smile makes her very “likeable”). Then click the “Generate Alternatives” button or simply press the Enter key, and Lexicon.ai will begin analyzing your prompt and searching for suitable alternatives that make sense in that context. Within about 3 seconds, Lexicon.ai outputs 3 versions of your sentence in which the quoted portion has been swapped for another fitting word or phrase. If you don’t like the results, you can submit the same input again and you will receive unique results every time. Finally, click whichever alternative you prefer to copy it to your clipboard and paste it back into your document. Voilà: you’ve just found synonyms that actually fit your sentence in under 3 seconds, a feat sites like Thesaurus.com can’t match!

How we built it:

Since the project predominantly relies on the AI, I designed example prompts to be passed to the AI with each request so that it understands the content and format of the expected output; essentially, the prompt contained several examples that looked like this:
“Input: I found his talk very “insightful”
Output: I found his talk very |enlightening| * I found his talk very |illuminating| * I found his talk very |eye-opening|”
You will notice the vertical lines and asterisks in the output; these markers are used later for formatting and presenting the AI’s output, as explained below.
The core of this project is Node and Express, which run the actual server, and there are 3 fundamental components: the HTML, the CSS, and the JavaScript. The first two are relatively straightforward; the JavaScript handles requests to the OpenAI API once the event listener for the “Generate Alternatives” button or the Enter key has been triggered. The new user input is appended to the pre-designed example prompts mentioned earlier, and the combined data (pre-designed example prompts + new user input) is passed in a request to Davinci, the most sophisticated GPT-3 model.

The output, along with other information from the API, is received in JSON format, and purely the predicted text is extracted from it. That text is then passed into a formatting function where each vertical line is converted into either an opening or closing bold HTML tag (<b> or </b>), which makes the changed portion of each new sentence render as bold text so it stands out, while the asterisks are used to split the entire output into the 3 separate generated sentences. In particular, this converts the example output above into “I found his talk very enlightening”, “I found his talk very illuminating”, and “I found his talk very eye-opening”, where each sentence is an individual element of the same array.

These sentences are ultimately displayed individually on the front end, and each of them is assigned an “onclick” handler that calls a copy function, which copies the innerText of the element (the newly produced sentence) to the clipboard. One more part worth mentioning: to be able to use the OpenAI Node.js module to make requests to the API from the browser, I used a tool called Browserify, which creates new, bundled versions of the JS files that can run in the browser.

Technologies we used:

  • HTML/CSS
  • JavaScript
  • Node.js
  • Express
  • AI/Machine Learning

Challenges we ran into:

There was nothing particularly challenging about the actual code; the hard part was getting the AI to produce consistent results. This meant I had to test a multitude of inputs to determine what kinds of recurring mistakes the AI was making, in order to create additional training data and example prompts for it to re-learn from. The 3 most frequent mistakes the AI made were:

- Only providing synonyms for 1 word when several consecutive words were selected

- Generating more than 3 alternatives

- Repeating the user input and counting it as one of the new alternatives

This continuous guess-and-check fixing process was quite a hassle, yet it taught me many valuable lessons about curating training data for any AI and improved my understanding of how typical AI models function. On the other hand, beyond the issues mentioned above, the AI always managed to find working synonyms, which was great to see.

Accomplishments we're proud of:

I’m most proud of managing to single-handedly finish the project in less than 24 hours (I had a flight early the next morning), and of successfully and efficiently debugging all the JavaScript. Furthermore, in this project I attempted to develop components and interactions I’d never built before, like the toast notification that pops up and says “Copied.”, and I am overjoyed that I was able to develop them with no previous experience, just trial and error. Lastly, I’m obviously delighted to have made a tool that nobody has created before, which I find simply revolutionary, and I hope Lexicon.ai can be the start of a snowball effect that eventually leads to many more tools that use AI to make everyday tasks easier.

What we've learned:

To reiterate what I already said, I’ve gained a much deeper understanding of the nature of AI models and of how and why they might act in peculiar ways in certain situations. Relevant to AI, I’ve also significantly improved my training data and prompt design skills, my ability to debug model output, and my ability to craft useful new data that fixes mistakes the AI model repeatedly makes. In addition, this was my first time ever using Node.js and Express, so I’ve gained a grasp of these frameworks’ general uses.

What's next:

From here, the next stepping stones are definitely the following:
- Adding a multi-selection feature; users will be able to select several separate parts of a sentence and generate synonyms for all of them simultaneously

- Continuously improving accuracy and reducing the number of recurring mistakes the AI makes

- Allowing users to select how many alternative sentences they want generated

- A feature to select the style of the alternatives, like formal, colloquial, maybe even Legal English, etc.

- Giving users the option to upload files or provide links to websites to be scraped, in order to generate entirely paraphrased versions of the content

- Filtering alternatives by length

- Initiating Beta testing

- Deploying the actual website and releasing it to the general public

Built with:

I used HTML, CSS, and JavaScript with Node.js and Express.js, the Davinci GPT-3 Model from OpenAI’s API, and Browserify. The entire project was developed on my Mac Mini in my room.

Prizes we're going for:

  • Best Documentation
  • Best Venture Pitch
  • Best Web Hack
  • Best Machine Learning Hack

Team Members

Steve Mendeleev

Table Number

Table TBD