Multi-Level Scraping with Drop Menu Navigation


Post by patrick95350 » Mon Oct 20, 2014 11:56 pm

I'm having trouble using the multilevel extraction technique in combination with the "select from each item in a list" premade.

I'm trying to scrape a CA state records webpage (http://www.ceqanet.ca.gov/QueryForm.asp).

The problem is that I need to select across two different drop-down menus ("Lead Agency" and "Document Type"): for each option in one menu, I need to select every option in the other. To make matters worse, the navigation is a web form, so I have to submit the form after selecting each combination, then return to the query page after scraping each results page.

I can get the select premades to work, one nested inside the other, and use a JavaScript submit command to go to each results page and scrape it. The problem is that the results page doesn't show which menu options I used to get there, so the scraped data has no context.
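
To make the goal concrete, here is a minimal sketch in plain JavaScript of the traversal described above: loop over every combination of the two menus, remember the selection before submitting, and attach it to each scraped row. This is not Helium Scraper's API; the function names, the option values, and the `fakeFetchResults` stand-in are all hypothetical and only illustrate the shape of the output being aimed for.

```javascript
// Hypothetical sketch: none of these names are Helium Scraper functions,
// and the option values and rows below are made-up sample data.

// Build every (agency, docType) pair from the two menus' option values.
function menuCombinations(agencies, docTypes) {
  const combos = [];
  for (const agency of agencies) {
    for (const docType of docTypes) {
      combos.push({ agency, docType });
    }
  }
  return combos;
}

// Record the selection BEFORE the form is submitted, then copy it onto every
// row scraped from the results page, which no longer shows the selection.
function scrapeAll(agencies, docTypes, fetchResults) {
  const records = [];
  for (const combo of menuCombinations(agencies, docTypes)) {
    // Stands in for: select both options, submit, scrape, return to the form.
    const rows = fetchResults(combo);
    for (const row of rows) {
      records.push({
        leadAgency: combo.agency,
        documentType: combo.docType,
        ...row,
      });
    }
  }
  return records;
}

// Stand-in for the real form submission; returns one made-up row per query.
function fakeFetchResults(combo) {
  return [{ title: `Sample result for ${combo.agency} / ${combo.docType}` }];
}

const records = scrapeAll(
  ["Agency A", "Agency B"],
  ["Type X", "Type Y"],
  fakeFetchResults
);
console.log(records);
```

Each record carries `leadAgency` and `documentType` next to the scraped fields, which is the context missing from the results page.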

I tried to use the multilevel IDs as in the YouTube tutorial, extracting the kinds' value tags, but when I do, no information shows up in the ID fields. If I test it by extracting only the select values, without submitting the form or extracting the actual content, the ID fields are populated. With the submit command in place, the ID fields come up empty again.

If anyone can help, the project file is here:
https://dl.dropboxusercontent.com/u/105 ... er%202.hsp

Many thanks!
