New Pack - Web Scraping with Scrapestack

Hey all,
After evaluating a lot of potential websites manually, I’ve found it’s far easier to juggle the data inside Coda. As such, I’ve created a new pack that links to the Scrapestack API and returns the HTML of any website when given an input URL.

From there, it’s a lot easier to bulk parse the HTML - and even get an AI’s eyes on it.
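
For anyone curious what a call like this looks like outside Coda, here is a minimal Python sketch of building the request URL. It is not the pack's actual code. The endpoint and the `access_key` / `url` parameter names are assumptions based on Scrapestack's public docs, and `build_scrape_url` is a made-up helper name, so check the docs for your plan before relying on it.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm against the Scrapestack documentation for your plan.
SCRAPESTACK_ENDPOINT = "https://api.scrapestack.com/scrape"


def build_scrape_url(access_key: str, target_url: str) -> str:
    """Return the request URL asking the scraping API to fetch target_url.

    urlencode percent-encodes the target URL so that its own query string
    is not confused with the API's parameters.
    """
    query = urlencode({"access_key": access_key, "url": target_url})
    return f"{SCRAPESTACK_ENDPOINT}?{query}"
```

Fetching that URL with any HTTP client should return the page's raw HTML as the response body, which is what then gets parsed.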

It pairs really well with @Filipe_Fortes’ fantastic HTML pack, which I recommend.

Would love to hear any initial thoughts/suggestions if anyone is looking to dig deeper into websites.


Hey @Billy_Jackson, could I use this to get a LinkedIn post if given a URL?

I think you probably could, although you might have to parse the raw HTML to extract the info from it. If you want to send me an example of the link you mean, I’ll happily put it in a doc and share the output with you.

I am quickly summarizing articles from the web using OpenAI (e.g., The CIO Imperative: Is your technology moving fast enough to realize your ambitions? | EY - Global, Six ways the CFO can use artificial intelligence, today | EY - Global). My initial thought is that I could use Scrapestack to obtain the HTML, clean it with the HTML pack and then have OpenAI summarize. However, this formula

Scrapestack::URLScrape([Juliana Kralik], 'Six ways the CFO can use artificial intelligence, today | EY - Global')

returns [object Object] as a result. Do you have any suggestions as to where I might be going wrong? Or do you have any suggestions on a more efficient way to accomplish this task?
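
The "clean it" step in the scrape, clean, summarize idea above can be sketched outside Coda roughly like this. It is only an illustration using Python's standard library, not what the HTML pack does internally. `TextExtractor` and `html_to_text` are hypothetical names, and the summarization call is left out on purpose.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style/noscript content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    """Reduce raw HTML to a single line of readable text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

The plain text this produces is much shorter than the raw HTML, which helps when the next step is a model with a limited input size.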


Hi Juliana,

The formula does return an object that then requires parsing, but it sounds like maybe this object is an error.
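
As a rough illustration of telling a real page apart from an error, here is a small Python sketch. It assumes the API reports failures as a JSON body shaped like `{"success": false, "error": {"code": ..., "info": ...}}` and returns plain HTML otherwise. That shape is an assumption, not something confirmed in this thread, and `describe_scrape_result` is a made-up helper name.

```python
import json


def describe_scrape_result(body: str) -> str:
    """Classify a raw response body as HTML or as an API error.

    Assumes (unverified) that errors arrive as JSON with "success": false
    and an "error" object carrying "code" and "info" fields.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON at all, so treat it as the scraped page itself.
        return "html"
    if isinstance(payload, dict) and payload.get("success") is False:
        error = payload.get("error", {})
        return f"error {error.get('code')}: {error.get('info')}"
    return "unknown JSON"
```

It is also worth checking that the value passed in is the full URL (starting with `http://` or `https://`) rather than the page title.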

Could you potentially add me as a collaborator on the document so I could take a look?

When scraping websites for CRM data enrichment, it’s crucial to respect each website’s terms of service and to follow ethical guidelines. Some websites explicitly prohibit scraping in their terms of use, so make sure you have the right to scrape the data you’re working with.

Hi @Billy_Jackson, were you able to help clarify this problem? I ran into the same error when referencing the URL from another column in the table.

Hi Aziz,
I can’t remember how we solved that previous problem, but if you can share your doc with me I can probably help you out!