As e-commerce continues to dominate retail in 2026, platforms like Amazon host millions of customer reviews that offer invaluable insights into product performance, consumer preferences, and market trends. Web scraping Amazon reviews with Python enables businesses, researchers, and developers to automate data collection, transforming raw feedback into actionable insights. This guide provides a comprehensive, step-by-step approach to building a robust Amazon review scraper.
Whether you’re new to web scraping reviews in Python or looking to optimize your existing tools, this article covers everything from setup to advanced integrations.
Web scraping reviews with Python isn’t just a technical exercise—it’s a strategic tool for gaining a competitive edge. In 2026, with AI-driven analytics on the rise, scraping Amazon reviews helps companies monitor sentiment, identify pain points, and benchmark against competitors.
● Sentiment Analysis: Aggregate ratings and text to gauge overall customer satisfaction.
● Trend Detection: Spot emerging patterns in feedback over time.
● Product Improvement: Use detailed reviews to refine offerings and boost sales.
● Market Research: Compare reviews across products without manual effort.
By leveraging Python for web scraping reviews, you can process thousands of entries efficiently. However, for large-scale operations, Thordata’s SERP Scraper API shines by providing seamless access to data without the hassle of blocks or bans.
Before diving into code, prepare your environment to ensure smooth web scraping of Amazon reviews with Python. This setup minimizes errors and supports efficient data handling.
You’ll need Python 3.8 or higher. Install the following packages via pip:
text
pip install requests beautifulsoup4 pandas lxml
● Requests: For sending HTTP requests to Amazon pages.
● BeautifulSoup: To parse HTML and extract review elements.
● Pandas: For structuring scraped data into DataFrames and exporting to CSV.
● lxml: A fast parser for BeautifulSoup.
These libraries form the core of most Python web scraping projects. If you're facing installation issues, refer to Python's official documentation.
To avoid detection, use custom headers that mimic a real browser. For added reliability, integrate residential proxies early; rotating residential IPs can push success rates above 99%, far beyond what free options achieve.
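As a minimal sketch of wiring a proxy into requests, note that the gateway host, port, and credentials below are placeholders, not real endpoints:
Python
import requests

# Placeholder gateway and credentials; substitute your provider's real values.
proxy_url = "http://USERNAME:PASSWORD@proxy.example.com:8000"
proxies = {"http": proxy_url, "https": proxy_url}

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/135.0",
    "Accept-Language": "en-US,en;q=0.9",
}

# requests routes the call through the proxy; a 200 status confirms the setup.
response = requests.get("https://www.amazon.com/dp/B098FKXT8L",
                        headers=headers, proxies=proxies, timeout=30)
print(response.status_code)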
Amazon's product pages (e.g., https://www.amazon.com/dp/B098FKXT8L) display a limited set of local and global reviews. As of 2026, reviews are typically wrapped in <div data-hook="review"> elements under IDs like #cm-cr-dp-review-list for local reviews and #cm-cr-global-review-list for global ones.
Within each review element, the key fields and their selectors are:
● Author: .a-profile-name
● Rating: .review-rating > span
● Date: .review-date
● Title and Content: Vary by local/global, often under .review-title and .review-text
● Images: .review-image-tile or similar
● Verified: span.a-size-mini
Inspect the page using browser dev tools to confirm selectors as the structure evolves. For full reviews, consider extending to /product-reviews/ASIN with pagination.
| Element | CSS Selector | Description |
| --- | --- | --- |
| Author | .a-profile-name | Reviewer's name |
| Rating | .review-rating > span | Star rating (e.g., "5.0") |
| Date | .review-date | Review timestamp |
| Title (Local) | .review-title span:not([class]) | Review headline |
| Content (Local) | .review-text | Full review body |
| Images | .review-image-tile | Attached photos |
This table summarizes core selectors for quick reference in your web scraping reviews Python project.
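Before wiring selectors into the scraper, you can sanity-check them interactively. A minimal sketch, assuming you have saved a product page locally as page.html (a hypothetical filename):
Python
from bs4 import BeautifulSoup

# Load a locally saved copy of the product page for offline testing.
with open("page.html", encoding="utf-8") as f:
    soup = BeautifulSoup(f.read(), "lxml")

first_review = soup.select_one("div[data-hook='review']")
if first_review:
    author = first_review.select_one(".a-profile-name")
    print(author.text.strip() if author else "selector not found")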
Let's build the scraper incrementally.
Start by importing libraries and defining the ASIN (Amazon Standard Identification Number).
Python
import requests
from bs4 import BeautifulSoup
import pandas as pd

asin = "B098FKXT8L"

custom_headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Connection": "keep-alive",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/135.0"
}

def get_soup(url):
    response = requests.get(url, headers=custom_headers)
    if response.status_code != 200:
        print("Error in getting webpage")
        return None
    return BeautifulSoup(response.text, "lxml")
This function fetches and parses the page.
Define a function to extract details from each review element.
Python
def extract_review(review, is_local=True):
    author_element = review.select_one(".a-profile-name")
    author = author_element.text.strip() if author_element else "Unknown"

    rating_element = review.select_one(".review-rating > span")
    rating = rating_element.text.replace("out of 5 stars", "").strip() if rating_element else "0"

    date_element = review.select_one(".review-date")
    date = date_element.text.strip() if date_element else "Unknown"

    if is_local:
        title_element = review.select_one(".review-title span:not([class])")
        content_element = review.select_one(".review-text")
        img_selector = ".review-image-tile"
    else:
        title_element = review.select_one(".review-title .cr-original-review-content")
        content_element = review.select_one(".review-text .cr-original-review-content")
        img_selector = ".linkless-review-image-tile"

    title = title_element.text.strip() if title_element else ""
    content = ' '.join(content_element.stripped_strings) if content_element else ""

    verified_element = review.select_one("span.a-size-mini")
    verified = verified_element.text.strip() if verified_element else None

    image_elements = review.select(img_selector)
    images = [img.attrs.get("data-src") for img in image_elements if img.attrs.get("data-src")] if image_elements else None

    return {
        "type": "local" if is_local else "global",
        "author": author,
        "rating": rating,
        "title": title,
        "content": content.replace("Read more", ""),
        "date": date,
        "verified": verified,
        "images": images
    }
This handles variations between local and global reviews.
Collect reviews with this function:
Python
def get_reviews(soup):
    reviews = []
    local_reviews = soup.select("#cm-cr-dp-review-list > div[data-hook='review']")
    global_reviews = soup.select("#cm-cr-global-review-list > div[data-hook='review']")
    for review in local_reviews:
        reviews.append(extract_review(review, is_local=True))
    for review in global_reviews:
        reviews.append(extract_review(review, is_local=False))
    return reviews
Exporting Scraped Data to CSV
Wrap it up in main:
Python
def main():
    search_url = f"https://www.amazon.com/dp/{asin}"
    soup = get_soup(search_url)
    if soup is None:
        return
    reviews = get_reviews(soup)
    df = pd.DataFrame(reviews)
    df.to_csv(f"reviews_{asin}.csv", index=False)
    print("Data exported to CSV.")

if __name__ == "__main__":
    main()
Run this script to save reviews in a CSV file.
Here's the complete, runnable script combining all parts:
Python
import requests
from bs4 import BeautifulSoup
import pandas as pd

asin = "B098FKXT8L"

# Browser-like headers reduce the chance of being served a block page.
custom_headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.9",
    "Connection": "keep-alive",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/135.0"
}

def get_soup(url):
    """Fetch a page and return a parsed BeautifulSoup object, or None on failure."""
    response = requests.get(url, headers=custom_headers)
    if response.status_code != 200:
        print("Error in getting webpage")
        return None
    return BeautifulSoup(response.text, "lxml")

def extract_review(review, is_local=True):
    """Extract one review's fields; local and global reviews use different selectors."""
    author_element = review.select_one(".a-profile-name")
    author = author_element.text.strip() if author_element else "Unknown"

    rating_element = review.select_one(".review-rating > span")
    rating = rating_element.text.replace("out of 5 stars", "").strip() if rating_element else "0"

    date_element = review.select_one(".review-date")
    date = date_element.text.strip() if date_element else "Unknown"

    if is_local:
        title_element = review.select_one(".review-title span:not([class])")
        content_element = review.select_one(".review-text")
        img_selector = ".review-image-tile"
    else:
        title_element = review.select_one(".review-title .cr-original-review-content")
        content_element = review.select_one(".review-text .cr-original-review-content")
        img_selector = ".linkless-review-image-tile"

    title = title_element.text.strip() if title_element else ""
    content = ' '.join(content_element.stripped_strings) if content_element else ""

    verified_element = review.select_one("span.a-size-mini")
    verified = verified_element.text.strip() if verified_element else None

    image_elements = review.select(img_selector)
    images = [img.attrs.get("data-src") for img in image_elements if img.attrs.get("data-src")] if image_elements else None

    return {
        "type": "local" if is_local else "global",
        "author": author,
        "rating": rating,
        "title": title,
        "content": content.replace("Read more", ""),  # strip the trailing UI label
        "date": date,
        "verified": verified,
        "images": images
    }

def get_reviews(soup):
    """Collect local and global reviews from the product page."""
    reviews = []
    local_reviews = soup.select("#cm-cr-dp-review-list > div[data-hook='review']")
    global_reviews = soup.select("#cm-cr-global-review-list > div[data-hook='review']")
    for review in local_reviews:
        reviews.append(extract_review(review, is_local=True))
    for review in global_reviews:
        reviews.append(extract_review(review, is_local=False))
    return reviews

def main():
    search_url = f"https://www.amazon.com/dp/{asin}"
    soup = get_soup(search_url)
    if soup is None:
        return
    reviews = get_reviews(soup)
    df = pd.DataFrame(reviews)
    df.to_csv(f"reviews_{asin}.csv", index=False)
    print("Data exported to CSV.")

if __name__ == "__main__":
    main()
This code matches the 2026 Amazon page structure described above, with fallbacks for missing elements so a single malformed review doesn't crash the run.
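For extra resilience against transient failures, get_soup can be extended with retries and exponential backoff. A sketch; the retry count and delays below are illustrative assumptions:
Python
import time

import requests
from bs4 import BeautifulSoup

def get_soup_with_retries(url, headers, max_retries=3):
    # Retry transient failures (e.g., intermittent 503s) with exponential backoff.
    for attempt in range(max_retries):
        response = requests.get(url, headers=headers, timeout=30)
        if response.status_code == 200:
            return BeautifulSoup(response.text, "lxml")
        print(f"Attempt {attempt + 1} failed with status {response.status_code}")
        time.sleep(2 ** attempt)  # 1s, 2s, 4s between attempts
    return None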
While custom Python scripts offer flexibility, scaling web scraping reviews can be challenging. Thordata's Amazon Scraper API addresses this by providing enterprise-grade solutions.
For smaller projects, Thordata offers a free trial to get started, with stable connections and basic anti-scraping bypass. For larger scales, Thordata Residential Proxies mimic real user behavior and minimize blocks, making them well suited to web scraping Amazon reviews with Python.
To maintain a positive user experience and comply with standards:
1. Rate-limit requests to avoid overloading servers (see the sketch after this list).
2. Use ethical data practices—don't misuse personal info.
3. Check Amazon's robots.txt and terms; scraping public data is common but consult legal advice.
4. Monitor for changes in page structure.
5. Opt for Thordata to ensure compliant, efficient scraping.
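For point 1, here is a minimal rate-limiting helper; the delay bounds are illustrative choices, not Amazon-mandated limits:
Python
import random
import time

import requests

def polite_get(url, headers, min_delay=2.0, max_delay=5.0):
    # Pause a randomized interval before each request to mimic human pacing
    # and keep load on the target server low.
    time.sleep(random.uniform(min_delay, max_delay))
    return requests.get(url, headers=headers, timeout=30)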
Web scraping Amazon reviews with Python empowers data-driven decisions in 2026's competitive landscape. By following this guide and leveraging Thordata's superior scraping products, you can achieve reliable, scalable results with minimal effort. Start small with the provided code, then scale up with Thordata for professional-grade performance.
Contact us at support@thordata.com for tailored advice.
Frequently asked questions
How to Scrape All Amazon Reviews with Python?
To scrape beyond the initial page, add pagination by looping through URLs like https://www.amazon.com/product-reviews/{asin}?pageNumber={page}. Use a while loop until no more reviews are found, and integrate Thordata proxies to handle volume without blocks.
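A sketch of that loop, reusing get_soup and extract_review from the script above; the page cap and the review-container selector are assumptions to verify in your browser's dev tools, since the /product-reviews/ page differs from the product page:
Python
def scrape_all_reviews(asin, max_pages=50):
    all_reviews = []
    page = 1
    while page <= max_pages:  # safety cap so the loop always terminates
        url = f"https://www.amazon.com/product-reviews/{asin}?pageNumber={page}"
        soup = get_soup(url)
        if soup is None:
            break
        # Confirm this selector against the live page before relying on it.
        page_reviews = soup.select("div[data-hook='review']")
        if not page_reviews:  # an empty page means we've run out of reviews
            break
        all_reviews.extend(extract_review(r, is_local=True) for r in page_reviews)
        page += 1
    return all_reviews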
What Are the Best Python Libraries for Web Scraping Amazon Reviews?
Top libraries include Requests for HTTP, BeautifulSoup for parsing, Pandas for data export, and Selenium for dynamic content. For advanced needs, combine with Thordata's API to simplify integration.
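If some review content only appears after JavaScript runs, a headless browser can render the page before parsing. A sketch assuming Selenium 4 with a local Chrome installation:
Python
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")  # run Chrome without a visible window
driver = webdriver.Chrome(options=options)
try:
    driver.get("https://www.amazon.com/dp/B098FKXT8L")
    # Hand the fully rendered HTML to BeautifulSoup for the usual parsing.
    soup = BeautifulSoup(driver.page_source, "lxml")
finally:
    driver.quit()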
How to Avoid Blocks When Web Scraping Amazon Reviews with Python?
Rotate user agents and IPs, respect robots.txt and sensible rate limits, and randomize request timing. For sustained volume, residential proxies combined with modest request rates keep block rates low while staying within ethical crawling practices.
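As one concrete measure, here is a sketch of per-request User-Agent rotation; the strings below are a small illustrative pool, not an exhaustive list:
Python
import random

import requests

# A small illustrative pool; in practice, maintain a larger set of current browser strings.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Gecko/20100101 Firefox/135.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
]

def rotating_get(url):
    # Pick a fresh User-Agent for every request.
    headers = {"User-Agent": random.choice(USER_AGENTS)}
    return requests.get(url, headers=headers, timeout=30)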
About the author
Jenny is a Content Specialist with a deep passion for digital technology and its impact on business growth. She has an eye for detail and a knack for creatively crafting insightful, results-focused content that educates and inspires. Her expertise lies in helping businesses and individuals navigate the ever-changing digital landscape.
The thordata Blog offers all its content in its original form and solely for informational intent. We do not offer any guarantees regarding the information found on the thordata Blog or any external sites that it may direct you to. It is essential that you seek legal counsel and thoroughly examine the specific terms of service of any website before engaging in any scraping endeavors, or obtain a scraping permit if required.