Author Topic: Scraping  (Read 2163 times)

Offline robertsala

  • Member
  • ***
  • Posts: 85
  • Karma: 8
  • New Forum User
« on: December 30, 2014, 08:05:14 pm »
Hi guys!

I'm creating a website to sell products from an authorized wholesaler. I'm using OpenCart as the eCommerce platform, and the biggest challenge right now is scraping all the data: products, prices, categories, descriptions, etc. Basically, what is called dropshipping.

Things I need to consider:
Import all products from all available data feeds
Import all categories
Import both short and long product descriptions
Download product images
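
The import steps listed above could be sketched roughly like this in Python. This is only a minimal sketch under assumptions: the thread doesn't say what format the wholesaler's feed uses, so a CSV feed is assumed, and the column names (`sku`, `name`, `category`, `price`, `short_description`, `long_description`, `image_url`) as well as the `Product`, `parse_feed`, and `collect_categories` names are all hypothetical. Getting the records into OpenCart is a separate step that isn't shown.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class Product:
    # Field names mirror the ASSUMED feed columns, not a real wholesaler schema.
    sku: str
    name: str
    category: str
    price: float
    short_description: str
    long_description: str
    image_url: str


def parse_feed(feed_text: str) -> List[Product]:
    """Turn CSV feed text into Product records (column names are assumptions)."""
    reader = csv.DictReader(io.StringIO(feed_text))
    products = []
    for row in reader:
        products.append(Product(
            sku=row["sku"].strip(),
            name=row["name"].strip(),
            category=row["category"].strip(),
            price=float(row["price"]),
            short_description=row["short_description"].strip(),
            long_description=row["long_description"].strip(),
            image_url=row["image_url"].strip(),
        ))
    return products


def collect_categories(products: List[Product]) -> List[str]:
    """Return the distinct category names found in the feed, sorted."""
    return sorted({p.category for p in products})
```

Each record keeps the image URL, so downloading the images can be a later pass over the same list.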

The wholesaler has an API, but I don't see ANYWHERE in OpenCart's admin panel to add it. Anyway, I bumped into a site called
It says it's a framework to extract the data I need from any website. Have you guys ever heard of this program, and do you know of any others you can recommend? Thanks in advance!

Website under development is

Offline 10i

  • Trusted User
  • Member
  • *****
  • Posts: 496
  • Karma: 147
  • Peppermint Enthusiast
  • Peppermint version(s): Peppermint 10 - 64 bit
Re: Scraping
« Reply #1 on: December 31, 2014, 10:48:08 am »
Hi, sorry I don't have any experience with this, but I do wish you the best of luck.
Peppermint user since Fire / Ice

Online VinDSL

  • Administrator
  • Hero
  • *****
  • Posts: 5821
  • Karma: 1122
  • Team Peppermint
  • Peppermint version(s): Developmental Builds
Re: Scraping
« Reply #2 on: January 03, 2015, 08:48:55 am »
Interesting website!

It looks like they came up with some sort of Python framework for scraper sites.  Python would be great for that -- you won't have to write your own libs.

Depending on what you're trying to accomplish, you might not need an industrial-strength framework like that.  Sure, if you're sucking data from 100s of websites for news aggregation, or whatevs, then something like that would be indispensable, but...

Personally, for the limited amount of scraping that I do on my websites, I just hard-code my scripts using cURL.  cURL is your friend.  I'm surprised more ppl don't use it.
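
For anyone wondering what a small hard-coded script like that looks like, here is a rough sketch of the same idea. Note that it is written in Python with the standard library's `urllib` rather than cURL itself, and it is not one of the actual scripts mentioned above. The `fetch` and `extract_prices` names, the User-Agent string, and the `price` CSS class are all made up for illustration; a real page will need its own selectors.

```python
from html.parser import HTMLParser
from typing import List
from urllib.request import Request, urlopen


def fetch(url: str, timeout: float = 10.0) -> str:
    """Download one page and return its body as text."""
    # The User-Agent value is a placeholder, not a required string.
    request = Request(url, headers={"User-Agent": "example-scraper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class PriceParser(HTMLParser):
    """Collect text directly inside elements whose class list includes 'price'.

    Simplification: any closing tag ends capture, so text that follows a
    nested tag inside the price element is not collected.
    """

    def __init__(self) -> None:
        super().__init__()
        self.prices: List[str] = []
        self._in_price = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "price" in classes:
            self._in_price = True

    def handle_endtag(self, tag):
        self._in_price = False

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(data.strip())


def extract_prices(html: str) -> List[str]:
    """Return the price strings found in an HTML document."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices
```

Usage would be along the lines of `extract_prices(fetch(url))` for whatever page you are allowed to pull from. Keeping the download and the parsing in separate functions means the parsing can be tried out on a saved copy of the page without hitting the site again.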

As an aside, be very careful when you suck data from other websites!  I got a cease and desist order from a corporate legal department, some years ago, along with a demand for $30,000 for scraping a few stock exchange prices off their website, and re-posting them on mine.  Just saying...

Anyway, I'll check into the website you linked more thoroughly later.  Thx  ;)
« Last Edit: January 03, 2015, 09:07:50 am by VinDSL »