
Question

 
January 15, 2009 04:12 AM

How can I get a list of random ASIN numbers?

An ASIN is a unique identifier that Amazon uses for its products. I would like a way to collect a bunch of random ASINs for analysis. I tried generating random strings, but valid ASINs are spread out enough that brute-force guess-and-check isn't practical.

Answers (1)

January 15, 2009 04:22 AM
How about a quick crawler/scraper? It's not hard to write code that gets a web page, finds the links (<a href> tags), looks for links in the format that represents a product, and parses out the ASIN. Then repeat for a random link on each page that fits your criteria.
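A minimal sketch of one step of that loop in Python, working on HTML you have already fetched. The URL pattern is an assumption: it treats links containing /dp/ or /gp/product/ followed by a 10-character code as product links, so adjust it to whatever the pages you crawl actually use. The names LinkCollector, asin_from_link and crawl_step are made up for illustration.

```python
import random
import re
from html.parser import HTMLParser

# Assumption: product links look like /dp/<ASIN> or /gp/product/<ASIN>,
# where <ASIN> is 10 uppercase letters or digits.
ASIN_IN_URL = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?=[/?#]|$)")


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html):
    """Return every link target found in the HTML text."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def asin_from_link(link):
    """Return the ASIN embedded in a product link, or None for other links."""
    match = ASIN_IN_URL.search(link)
    return match.group(1) if match else None


def crawl_step(html, rng=random):
    """Return (ASINs on this page, one random product link to follow next)."""
    product_links = [link for link in extract_links(html) if asin_from_link(link)]
    asins = {asin_from_link(link) for link in product_links}
    next_link = rng.choice(product_links) if product_links else None
    return asins, next_link
```

Fetching is left out on purpose. Feed crawl_step the text of each page you download, add the returned ASINs to your running collection, and fetch the returned link next.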

They won't be truly "random" in that there is likely to be some relation between the products, but it's a start. Run the crawler long enough and you'll have a big enough sample that you can then sample *that* and it'll feel more random.
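The "sample the sample" step could look something like this. It is only a sketch: sample_asins is a hypothetical helper, and it assumes you have already collected the crawled ASINs into a list or set.

```python
import random


def sample_asins(pool, k, seed=None):
    """Draw up to k distinct ASINs uniformly at random from the crawled pool."""
    rng = random.Random(seed)
    # Remove duplicates first so heavily linked products are not over-represented.
    unique = sorted(set(pool))
    return rng.sample(unique, min(k, len(unique)))
```

Passing a seed makes the draw repeatable, which helps if you want to rerun the analysis on the same subset.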

Answered by shakespearegeek
January 15, 2009 04:26 AM
Well, they do have an API that would be a bit more efficient than actually scraping the page. There are things called browseNodes, but that's about as far as I've gotten.

January 15, 2009 02:11 PM
You'd be surprised about the efficiency of APIs. You have to be in on the program and get your keys, blah blah blah, which I assume you've already done given your comment. And then you're limited to what their API lets you do. Scraping the page, on the other hand, doesn't take any special access, and you can use whatever tools you can get your hands on. Many a business is founded on scraping content and reassembling it into a more useful structure. I've been at two companies now that do that, so crawlers/scrapers are just another part of the toolbox to me.

If you've got any shell programming skills, get yourself a copy of the "wget" program, which automatically pulls down HTML pages to your file system. Once you've got a page locally (that's one hit to Amazon, so it's not even like they'd notice), use the language of your choice to break it up and analyze it (I prefer Ruby). Part of what you'll find in any given page is more links, so when you need more stuff to analyze, just recursively go get more pages.

http://en.wikipedia.org/wiki/Wget
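
For the "break it up and analyze it" part, here is a rough Python sketch (not Ruby, purely as an illustration) that walks a directory of pages already saved by wget and pulls out the ASINs and links. As before, the /dp/ and /gp/product/ URL pattern is an assumption about what product links look like, and scan_saved_pages is a made-up name.

```python
import re
from pathlib import Path

# Assumption: product links look like /dp/<ASIN> or /gp/product/<ASIN>,
# where <ASIN> is 10 uppercase letters or digits.
ASIN_IN_URL = re.compile(r"/(?:dp|gp/product)/([A-Z0-9]{10})(?=[/?#]|$)")
# Rough href matcher; good enough for a quick scan, not a full HTML parser.
HREF = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def scan_saved_pages(directory):
    """Return (ASINs, links) found in every saved file under the directory."""
    asins, links = set(), set()
    for path in Path(directory).rglob("*"):
        if not path.is_file():
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for href in HREF.findall(text):
            links.add(href)
            match = ASIN_IN_URL.search(href)
            if match:
                asins.add(match.group(1))
    return asins, links
```

The returned links are the candidates for the next round of downloads, so the recursion is just: download, scan, pick some links, download again.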
