Description as a Tweet:

Identifying counterfeit products and preventing their sale on e-commerce sites using natural language processing and neural networks.

Inspiration:

In the midst of a global pandemic, more and more people are shopping online instead of visiting physical stores. With e-commerce projected to be a $4.2 trillion industry this year, sites are facing record-high numbers of counterfeit listings. In fact, pre-pandemic trends indicate that one in five products sold on e-commerce sites is counterfeit and that over $71 million has been lost to fraudulent listings. Having realized the negative effects not only on individual consumers and store owners but also on the world’s economy as a whole, we were inspired to create ShopSafe.

What it does:

ShopSafe is a browser extension that, when enabled, analyzes the products you are shopping for on an e-commerce site and estimates the likelihood that an item is counterfeit. Trained on thousands of consumer reviews, ShopSafe analyzes product reviews using sentiment analysis with natural language processing and classifies each one into one of two categories: Real or Fake. After every review has been analyzed, the percentage of fake reviews is compared to the percentage of real reviews. Whichever percentage is larger determines the authenticity verdict for the product.
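The comparison step described above can be sketched in a few lines of Python. This is only an illustration, not ShopSafe's actual code: the function name `product_verdict`, the verdict strings, and the rule that a tie counts as authentic are all assumptions made for the example.

```python
from collections import Counter


def product_verdict(review_labels):
    """Aggregate per-review labels ("Real" or "Fake") into a product verdict.

    Hypothetical helper: the name, the verdict strings, and the tie rule
    (a tie is treated as authentic) are assumptions, not ShopSafe's code.
    """
    if not review_labels:
        raise ValueError("need at least one classified review")
    counts = Counter(review_labels)
    unknown = set(counts) - {"Real", "Fake"}
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")

    total = len(review_labels)
    fake_share = counts["Fake"] / total
    real_share = counts["Real"] / total

    # The larger share decides the verdict.
    if fake_share > real_share:
        verdict = "Likely counterfeit"
    else:
        verdict = "Likely authentic"
    return verdict, fake_share
```

For example, a product with two reviews classified as Fake and one as Real would be flagged as likely counterfeit, with a fake share of about 67%.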

How we built it:

ShopSafe was built using Python, JavaScript, TensorFlow, and Flask. We built the model that analyzes product reviews with TensorFlow. To train the network, we used a dataset of customer reviews from various websites. The two categories of the dataset, Real and Fake, represent real consumer reviews and fake positive customer reviews. We split the data into training, validation, and testing sets using a 75/5/20 split. We built a four-layer neural network to classify the reviews. The training phase consisted of ten epochs, with a binary accuracy of around 76% per epoch. After training the model, we evaluated it on our test set and recorded a final accuracy of 65.05%. The consumer-facing portion of ShopSafe is a Chrome extension written in JavaScript. The extension scrapes the reviews of the product in question and sends them to the backend Flask server. The Flask server manages requests and runs the NLP model to determine the legitimacy of each review. After further processing, the results are sent back to the extension, which injects a JavaScript alert containing the processed data into the e-commerce page.
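As a rough illustration of the 75/5/20 split mentioned above, the sketch below shuffles a list of labeled reviews and slices it into training, validation, and test sets using only the Python standard library. The function name `split_reviews`, the fixed seed, and the rounding behavior are assumptions for the example; the project may well have used a TensorFlow or other utility for this step.

```python
import random


def split_reviews(reviews, train=0.75, val=0.05, seed=42):
    """Shuffle reviews and split them into train/validation/test lists.

    Hypothetical helper: whatever remains after the train and validation
    slices (about 20% with the defaults) becomes the test set.
    """
    if train < 0 or val < 0 or train + val > 1:
        raise ValueError("train and val must be non-negative and sum to at most 1")

    shuffled = list(reviews)                # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # seeded for a reproducible split

    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)

    train_set = shuffled[:n_train]
    val_set = shuffled[n_train:n_train + n_val]
    test_set = shuffled[n_train + n_val:]
    return train_set, val_set, test_set
```

With 1,000 reviews and the default fractions, this yields 750 training, 50 validation, and 200 test examples, and every review lands in exactly one of the three sets.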

Technologies we used:

  • JavaScript
  • Python
  • Flask
  • AI/Machine Learning

Challenges we ran into:

One major challenge we faced was finding a suitable dataset to train our neural network. Initially, we searched for datasets that classified e-commerce products as real or fake. After hours of searching, we were unable to find one that met those standards. This made us rethink our detection process for counterfeit products. With a little brainstorming, we determined that another way to identify a counterfeit e-commerce product would be to analyze the legitimacy of the product’s reviews. Furthermore, we realized that finding a consumer reviews dataset would be considerably easier than finding the dataset we were originally looking for. Another challenge was creating the extension and linking it to the script that scrapes reviews and feeds them into the model. We found this difficult because it was our first time creating a browser extension. The documentation and community examples were sparse, making this portion of the project a greater challenge than we had anticipated.

Accomplishments we're proud of:

We are proud of the sentiment analysis model that we created. Having never created a model before, we built a complete natural language processing model that trains and tests on reviews in less than five seconds. We are also proud of merging the model into a lightweight Chrome extension that hides the technical complexity from the end user.

What we've learned:

During the development phase of this project, we learned about natural language processing and building sentiment analysis models with TensorFlow. Furthermore, designing the Chrome extension and its backend allowed us to learn more about JavaScript and Flask.

What's next:

In the future, we hope to expand the number of factors taken into consideration by the machine learning model. This will allow us to improve the accuracy of classifying products as real or fake. These factors could include the seller, the shipping origin, and the number of verified purchases. We would also like to extend our Chrome extension to other e-commerce sites and improve our classification heuristic. Another goal for this project is to create a proper, visually appealing UI.

Built with:

Python, JavaScript, TensorFlow, and Flask.

Prizes we're going for:

  • Best Finance Hack
  • Best Documentation
  • Best Venture Pitch
  • Best Web Hack
  • Best Machine Learning Hack

Team Members

Aravind Natchiappan
Rahul Ravindranathan

Table Number

Table TBD