Mango Scraper — Extract Mango data
Mango scraper powered by Scrapy and Selenium. Run a full crawl or scrape a single product.
Details
- Demo: https://shop.mango.com/co/
- Country: Colombia
- Status: ✅ Production
- Spider: stylos/spiders/mango.py
- Extractor: stylos/extractors/mango_extractor.py
- Lines of code: 416
- Domains: shop.mango.com
Implemented Features
- Footer navigation: categories from footer links
- Categories: Women and Men with full navigation
- Advanced extraction: products, prices, descriptions, images
- Images per color (max 15 per color) with deduplication
- Progressive scrolling up to 30 attempts
- Selenium integration with anti-detection measures
- Pricing system with discount detection
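The discount detection in the pricing system can be illustrated with a minimal sketch. This is not the code in stylos/extractors/mango_extractor.py. The function name compute_discount and the rounding to two decimals are assumptions made for illustration only.

```python
from decimal import Decimal, ROUND_HALF_UP


def compute_discount(original_price, current_price):
    """Return (amount, percentage), or None when no discount applies.

    Hypothetical helper: the real extractor may detect discounts differently.
    """
    # Go through str() so float inputs do not carry binary rounding noise.
    original = Decimal(str(original_price))
    current = Decimal(str(current_price))

    # No discount if the original price is missing/invalid or not higher.
    if original <= 0 or current >= original:
        return None

    amount = original - current
    percentage = (amount / original * 100).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return amount, percentage


if __name__ == "__main__":
    # Example prices in COP (made-up values).
    print(compute_discount(199900, 139930))
```

Decimal is used here instead of float so that price arithmetic stays exact, which is a common choice for currency values.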
Technical Capabilities
scrapy crawl mango # Full crawl
scrapy crawl mango -a url="URL" # Single product
scrapy crawl mango -o products.json # Export results
Extracted Data
- Normalized product name
- Full description
- Original and current price
- Discount percentage and amount
- Automatically detected currency (COP)
- Canonical product URL
- Images organized by color with duplicate detection
- Extraction metadata (date, site)
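Grouping images by color, with duplicate detection and the 15-image cap from the feature list, might look roughly like the sketch below. The function names and the idea of treating URLs that differ only in their query string as duplicates are assumptions. The actual extractor may use a different rule.

```python
from urllib.parse import urlsplit, urlunsplit

MAX_IMAGES_PER_COLOR = 15  # cap stated in the feature list


def normalize_image_url(url):
    """Drop query string and fragment so resized variants compare equal.

    Assumption: size/quality variants differ only in the query string.
    """
    parts = urlsplit(url.strip())
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def group_images_by_color(pairs, limit=MAX_IMAGES_PER_COLOR):
    """Build {color: [url, ...]} from (color, url) pairs.

    Keeps first-seen order, skips duplicates within a color,
    and stops adding images to a color once the limit is reached.
    """
    grouped = {}
    seen = {}
    for color, url in pairs:
        key = normalize_image_url(url)
        bucket = grouped.setdefault(color, [])
        seen_for_color = seen.setdefault(color, set())
        if key in seen_for_color or len(bucket) >= limit:
            continue
        seen_for_color.add(key)
        bucket.append(key)
    return grouped


if __name__ == "__main__":
    sample = [
        ("Black", "https://example.com/a.jpg?w=100"),
        ("Black", "https://example.com/a.jpg?w=800"),
        ("Red", "https://example.com/a.jpg"),
    ]
    print(group_images_by_color(sample))
```

Deduplication is done per color, so the same image URL may still appear under two different colors.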
FAQ
How do I run the Mango scraper?
Use Scrapy: run 'scrapy crawl mango' for a full crawl, or 'scrapy crawl mango -a url="URL"' for a single product.
What data does the Mango scraper extract?
Name, description, original and current price, discount percentage and amount, currency (COP), canonical URL, images by color, and extraction metadata.
Docker Compose is up. Can I run the scraper with a script that talks to the API?
Yes. Use the control_scraper.py script, which talks to the API to orchestrate scraping. Example (full run): python control_scraper.py --spider mango