Data mining - Error when scraping Amazon with Selenium and bs4

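I am using BeautifulSoup and webdriver for a course project to scrape disposable diapers on Amazon and collect each item's name, price, reviews, and rating.

My goal is to end up with something like this, which I will then split into separate columns: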

    Diapers Size 4, 150 Count - Pampers Swaddlers Disposable Baby Diapers, One Month Supply
    4.0 out of 5 stars
    1,982
    $43.98
    ($0.29/Count)
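As a minimal sketch of the "split into columns" step, the raw strings shown above could be converted into typed values before they go into the table. The helper names (`parse_rating`, `parse_count`, `parse_price`, `to_row`) are hypothetical and not part of the original code; the column names are taken from the DataFrame in the question, and a value that cannot be parsed (such as the "Null" placeholder) becomes `None`.

```python
import re
from typing import Optional


def parse_rating(text: str) -> Optional[float]:
    """Turn '4.0 out of 5 stars' into 4.0; return None if no rating is found."""
    match = re.search(r"(\d+(?:\.\d+)?)\s+out of", text)
    return float(match.group(1)) if match else None


def parse_count(text: str) -> Optional[int]:
    """Turn '1,982' into 1982; return None if the text has no digits."""
    digits = re.sub(r"[^\d]", "", text)
    return int(digits) if digits else None


def parse_price(text: str) -> Optional[float]:
    """Turn '$43.98' or '($0.29/Count)' into a float; return None if no price is found."""
    match = re.search(r"\$\s*(\d[\d,]*(?:\.\d+)?)", text)
    return float(match.group(1).replace(",", "")) if match else None


def to_row(name: str, rating: str, reviews: str, price: str, unit_price: str) -> dict:
    """Build one record keyed by the column names used in the question."""
    return {
        "Product Name": " ".join(name.split()),  # collapse stray whitespace
        "Rating": parse_rating(rating),
        "Number of Reviews": parse_count(reviews),
        "Price": parse_price(price),
        "Price Count": parse_price(unit_price),
    }


row = to_row(
    "Diapers Size 4, 150 Count - Pampers Swaddlers Disposable Baby Diapers, One Month Supply",
    "4.0 out of 5 stars",
    "1,982",
    "$43.98",
    "($0.29/Count)",
)
print(row)
```

A list of such dictionaries can be passed to `pd.DataFrame(...)` once at the end, which avoids growing the DataFrame one row at a time.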

Unfortunately, after 50 records have been output, I get this message: Message: no such element: unable to locate element: {"method":"css selector","selector":".a-last"}

Here is my code:

import pandas as pd
from bs4 import BeautifulSoup
from selenium import webdriver

URL = "https://www.amazon.com/s?k=baby+disposable&rh=n%3A166772011&ref=nb_sb_noss"
driver = webdriver.Chrome('C:/Users/Desktop/chromedriver_win32/chromedriver.exe')
driver.get(URL)
html = driver.page_source
soup = BeautifulSoup(html, "html.parser")
df = pd.DataFrame(columns=["Product Name", "Rating", "Number of Reviews", "Price", "Price Count"])

while True:
    # Each search result is a grid cell carrying this exact class string.
    for i in soup.find_all(class_="sg-col-4-of-24 sg-col-4-of-12 sg-col-4-of-36 s-result-item sg-col-4-of-28 sg-col-4-of-16 sg-col sg-col-4-of-20 sg-col-4-of-32"):
        ProductName = i.find(class_="a-size-base-plus a-color-base a-text-normal").text
        print(ProductName)
        # i.find() returns None when a field is missing, so .text raises AttributeError.
        try:
            Rating = i.find(class_="a-icon-alt").text
        except AttributeError:
            Rating = "Null"
        print(Rating)
        try:
            NumberOfReviews = i.find(class_="a-size-base").text
        except AttributeError:
            NumberOfReviews = "Null"
        print(NumberOfReviews)
        try:
            Price = i.find(class_="a-offscreen").text
        except AttributeError:
            Price = "Null"
        print(Price)
        try:
            PriceCount = i.find(class_="a-size-base a-color-secondary").text
        except AttributeError:
            PriceCount = "Null"
        print(PriceCount)
        df = df.append({"Product Name": ProductName, "Rating": Rating,
                        "Number of Reviews": NumberOfReviews,
                        "Price": Price, "Price Count": PriceCount}, ignore_index=True)
    # A disabled "Next" button is taken to mean the last page has been reached.
    nextlink = soup.find(class_="a-disabled a-last")
    if nextlink:
        print("This is the last page.")
        break
    else:
        # This is the call that raises the "no such element" error.
        progress = driver.find_element_by_class_name('a-last').click()
        subhtml = driver.page_source
        soup = BeautifulSoup(subhtml, "html.parser")

Unfortunately, I have hit a roadblock trying to figure out why it does not find a-last.
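One defensive way to structure the pagination loop, offered as a sketch rather than a confirmed fix: `find_element` raises when nothing matches, while `find_elements` (plural) returns an empty list, so a page without an enabled `a-last` element can be treated as the end instead of crashing. The function name `scrape_all_pages`, the `parse_page` callback, and the selector `.a-last:not(.a-disabled)` are assumptions for illustration; the selector is based only on the class names used in the code above, and the real page markup may differ. The locator strategy is passed as the plain string `"css selector"`, which is the value of Selenium's `By.CSS_SELECTOR`.

```python
import time

# Assumed markup: the "Next" control has class "a-last", plus "a-disabled" on the last page.
NEXT_SELECTOR = ".a-last:not(.a-disabled)"


def scrape_all_pages(driver, parse_page, max_pages=50, pause=1.0):
    """Collect rows from every results page until no enabled Next control is found.

    driver     -- a Selenium WebDriver (any object with page_source and find_elements)
    parse_page -- hypothetical callback: takes page HTML, returns a list of rows
    max_pages  -- hard upper bound so the loop cannot run forever
    pause      -- seconds to wait after clicking, a crude stand-in for an explicit wait
    """
    rows = []
    for _ in range(max_pages):
        rows.extend(parse_page(driver.page_source))
        # find_elements returns [] instead of raising NoSuchElementException.
        next_controls = driver.find_elements("css selector", NEXT_SELECTOR)
        if not next_controls:
            break  # last page, or a page without the expected pagination markup
        next_controls[0].click()
        time.sleep(pause)
    return rows
```

With the code from the question, `parse_page` would wrap the body of the `for` loop and return the extracted records, and the call would look like `rows = scrape_all_pages(driver, parse_page)`. An explicit wait such as Selenium's `WebDriverWait` would likely be more robust than a fixed pause, since the page source may be read before the next page has finished loading.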