Building a Chrome Extension in 2021

Learn what it took to build Trove Agent, a web scraping Chrome Extension inspired by Kimono - a popular extension that was acquired by Palantir and eventually retired.

Starting with an idea

With Trove, my vision is to reduce the setup, code and configuration required for scraping data from the web and automating the browser. Many businesses use web scraping and automation to bootstrap, grow, research and stay competitive. To scrape the web at greater scale, I built the Trove API, which is the backbone of that vision. On top of this API, I believe no-code and no-configuration products will help democratize data on the web. That is why I decided a Chrome Extension would be the next logical step. The image below shows what Trove Agent looks like.

Trove Agent

I remember using Kimono back in 2014 to pull data into some spreadsheets. It just worked and got out of the way. Unfortunately, they were acquired a few years later: TechCrunch: Palantir Acquires Kimono Labs For Its Web-Scraping Service. Some of the features were great, such as their selector engine. The real benefit was that you could point and click to label a few data points and start scraping; no heavy backend or configuration was required.

Setting up

There are quite a few ways to build a Chrome Extension. You could use any frontend framework that works in the browser. Trove Agent is built with React.js and uses gulp + webpack to bundle and pack the extension. You could also use create-react-app to build the bundle and simply pack it yourself with any task runner. The more challenging part is running a dev server so you can iterate on the extension faster. I solved this by running an Express server with the webpack dev and hot middleware. When running locally, the extension injects the code served by the dev server rather than the bundled file. Styles are imported into the React components, handled by style-loader and injected into the DOM.
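One way the switch between the dev server and the bundled file could look is sketched below. This is an assumption-heavy illustration, not the Trove Agent source: the names `DEV_BUNDLE_URL`, `BUNDLED_FILE` and `buildInjectDetails` are hypothetical, the port is borrowed from the localhost entry in the manifest's content security policy, and the idea of fetching the dev bundle as text and injecting it as a code string is my guess at one workable approach.

```javascript
// Hypothetical names and paths, for illustration only.
const DEV_BUNDLE_URL = "http://localhost:3005/assets/js/app.min.js"; // assumed dev-server path
const BUNDLED_FILE = "assets/js/app.min.js"; // path inside the packed extension

// Builds the `details` object that chrome.tabs.executeScript expects.
// In development, the latest bundle is fetched from the dev server and
// injected as a code string; otherwise the packaged file is referenced.
async function buildInjectDetails(isDev, fetchImpl = globalThis.fetch) {
  if (!isDev) {
    return { file: BUNDLED_FILE };
  }
  const response = await fetchImpl(DEV_BUNDLE_URL);
  if (!response.ok) {
    throw new Error(`Dev server responded with status ${response.status}`);
  }
  return { code: await response.text() };
}
```

The result can be passed straight to `chrome.tabs.executeScript(tabId, details)`, so the rest of the extension does not need to know which mode it is running in.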

I inject the JavaScript code into the current window using the chrome.tabs.executeScript API. This is managed by the background.html page, which runs in persistent mode to hold some state for the running extension. The extension code is injected into the current browser tab and sends a message to the background.html page when the user dismisses the extension.
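The flow described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual Trove Agent background script: the function name `registerBackground`, the `"dismiss"` message type and the choice of `assets/js/inject.min.js` as the injected file are hypothetical. The `chrome` object is passed in as a parameter so the wiring can be exercised outside the browser.

```javascript
// Sketch of a persistent background script (Manifest V2 APIs).
// `chromeApi` would be the global `chrome` object inside the extension.
function registerBackground(chromeApi, injectFile = "assets/js/inject.min.js") {
  const activeTabs = new Set(); // ids of tabs where the extension UI is showing

  // Toolbar icon click: inject the UI into the clicked tab, once per tab.
  chromeApi.browserAction.onClicked.addListener((tab) => {
    if (activeTabs.has(tab.id)) return;
    chromeApi.tabs.executeScript(tab.id, { file: injectFile }, () => {
      if (chromeApi.runtime.lastError) return; // injection failed, e.g. on a restricted page
      activeTabs.add(tab.id);
    });
  });

  // The injected code reports back when the user dismisses the UI.
  // The "dismiss" message type is a hypothetical name.
  chromeApi.runtime.onMessage.addListener((message, sender) => {
    if (message && message.type === "dismiss" && sender.tab) {
      activeTabs.delete(sender.tab.id);
    }
  });

  return activeTabs;
}
```

On the injected side, the matching call would be something like `chrome.runtime.sendMessage({ type: "dismiss" })` when the user closes the panel.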

Chrome Extension Persistent

Another way to build a Chrome extension is with a popup: an HTML page that appears under your extension icon. An example of this is the Postman Interceptor extension. This didn't make much sense for us, since the popup disappears as soon as you interact with the current tab.

Chrome Extension Popup

The manifest.json

The manifest.json file is required as it describes your extension, asks for permissions and defines how it will work. Trove Agent requests access to all domains, so that you can launch the extension on any website.

Here is what our manifest.json looks like; there isn't anything proprietary in it:

{
  "background": { "page": "views/background.html", "persistent": true },
  "browser_action": {
    "default_title": "Trove Agent - Web Scraping & Automation"
  },
  "content_security_policy": "default-src 'self' http://localhost:3005 http://127.0.0.1:3005; script-src 'self'; style-src * 'unsafe-inline'; img-src 'self' data:;",
  "default_locale": "en",
  "description": "With Trove Agent you can scrape & automate the web right in your browser. No code or configuration required.",
  "icons": {
    "16": "assets/images/logo-16.png",
    "48": "assets/images/logo-48.png",
    "128": "assets/images/logo-128.png"
  },
  "manifest_version": 2,
  "name": "Trove Agent - Web Scraping & Automation",
  "permissions": [
    "downloads",
    "unlimitedStorage",
    "storage",
    "http://*/",
    "https://*/"
  ],
  "version": "1.0.1",
  "web_accessible_resources": ["assets/images/*"]
}
  • background : the background page to load; it can be a JavaScript file or an HTML file. Setting persistent to true keeps the background page alive.
  • browser_action : places the extension icon on the toolbar at the top right and lets the extension respond to user input, such as clicking the icon to enable the extension for the current tab.
  • permissions : declares what the extension can and cannot do in the user's browser.
  • web_accessible_resources : a list of paths to resources that can be used in the context of a web page. The extension can load images and other resources from these paths.

Charging for the extension

Trove Agent isn't free, both because of the value it can add and because of the time it takes to build an extension like this. To accommodate payments, the extension asks you to buy a license from our website using Stripe Checkout before you can use it. The license is generated on the backend after the Stripe checkout succeeds, then displayed to the user and emailed to them. The user copies the license key and pastes it into the modal when enabling the extension. The extension makes an AJAX call to our server to validate the license. Once it is validated, the license activation is synced to Chrome storage so that the user does not need to activate again every time they launch the extension.
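The validate-then-remember step might look roughly like the sketch below. Everything specific here is an assumption: the endpoint URL, the request and response shapes, the `licenseActivated` storage key and the function name `activateLicense` are hypothetical placeholders, since the real server API is not shown in this post. The `fetch` implementation and the storage area are passed in so the logic can run outside the browser.

```javascript
// Hypothetical endpoint and response shape, for illustration only.
const LICENSE_ENDPOINT = "https://example.com/api/licenses/validate";

// Validates a license key against the server and, on success, records the
// activation in the given storage area (e.g. chrome.storage.sync) so later
// launches can skip the prompt. Resolves to true when activation succeeded.
async function activateLicense(licenseKey, { fetchImpl, storage }) {
  const key = licenseKey.trim();
  if (!key) return false;

  const response = await fetchImpl(LICENSE_ENDPOINT, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ license: key }),
  });
  if (!response.ok) return false;

  const result = await response.json(); // assumed shape: { valid: boolean }
  if (!result.valid) return false;

  // chrome.storage.sync.set takes an items object and a completion callback.
  await new Promise((resolve) => storage.set({ licenseActivated: true }, resolve));
  return true;
}
```

Inside the extension this could be called as `activateLicense(input, { fetchImpl: fetch, storage: chrome.storage.sync })`, with a matching `chrome.storage.sync.get` check on launch to decide whether to show the license modal at all.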

Update: Getting rid of the extension paywall

I thought by charging for the extension I could generate some revenue to reinvest into the business. But this approach is wrong. The extension should be used to generate traction for the business, bringing in more users over time. That's why I'm now making the extension free to use, including any updates and bug fixes that are added over time. The paywall had low conversion and did not lead to any meaningful sales or revenue.

The latest update, version 1.1.0, will have no paywall on the Chrome Web Store once it is approved. All customers who purchased the extension will receive a refund.

Bundling for the Chrome Store

The Chrome Web Store requires you to submit a zip file. The structure I used is straightforward:

├── _locales
│   └── en
│       └── messages.json
├── assets
│   ├── images
│   │   ├── logo-128.png
│   │   ├── logo-16.png
│   │   ├── logo-48.png
│   │   ├── icons
│   │   └── trove-logo-short.svg
│   └── js
│       ├── app.min.js
│       ├── app.min.js.LICENSE.txt
│       ├── background.min.js
│       ├── inject.min.js
│       └── inject.min.js.LICENSE.txt
├── manifest.json
└── views
    └── background.html

The only mandatory part is the manifest.json at the root. Chrome interprets it and runs your extension as described. Once you have a bundle (zip file) of your extension, you can submit it to the Chrome Web Store, where it has to pass the review process before it is published.

The Review

The review process can be slow; for us it took over a week to get approved. It can also be strict: we were denied once for having localhost code in the extension (for local development purposes). Don't give up though. Make the changes and submit a new build until you get approved.
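One guard against shipping development hosts is to strip them from the manifest as part of the production build. The sketch below is an assumption about what such a step could look like, not the fix we actually shipped: `stripDevSources` is a hypothetical helper, and it only covers the localhost entries in the content security policy, not any other localhost references in the code.

```javascript
// Returns a copy of the manifest whose CSP has no localhost / 127.0.0.1
// sources, so the production bundle carries no development hosts.
function stripDevSources(manifest) {
  if (!manifest.content_security_policy) {
    return { ...manifest };
  }
  const isDevSource = (token) =>
    /^https?:\/\/(localhost|127\.0\.0\.1)(:\d+)?$/.test(token);

  const csp = manifest.content_security_policy
    .split(";")
    .map((directive) => directive.trim())
    .filter(Boolean) // drop the empty piece after a trailing semicolon
    .map((directive) =>
      directive.split(/\s+/).filter((token) => !isDevSource(token)).join(" ")
    )
    .join("; ");

  return { ...manifest, content_security_policy: csp };
}
```

Running this in the gulp or webpack step that writes the production manifest.json keeps the development manifest convenient while the zip submitted for review stays clean.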

Wrapping Up

I hope this was helpful. The Chrome extension marketplace is large and lets you reach a big audience quickly. The marketing effort is still up to you though: eventually you will have to rack up reviews on the web store so people start to discover your extension!

I recently ran into this, which could have helped me start faster: Chrome Extension Starter Kit. And finally, you can check out the extension here: Trove Agent Chrome Store.
