Problem with selecting text

dg151
Posts: 3
Joined: Fri May 27, 2011 7:28 pm

Post by dg151 » Fri May 27, 2011 7:39 pm

Hi admin,

I'm trying to use Helium Scraper to scrape the search results generated at http://comluv.com/search-results. The site's search results appear to be generated dynamically using Google search and are probably displayed in an iframe/frame, so I'm having problems selecting the URL or title text in the search results. I switched on 'selection mode' and clicked on the URLs and titles to select them for extraction, but I'm unable to select any of them. Could you suggest a workaround for this issue? Thanks.

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Re: Problem with selecting text

Post by webmaster » Fri May 27, 2011 9:43 pm

Hi,

Here is the actual URL of that IFRAME:

http://www.google.com/custom?hl=en&clie ... art=0&sa=N

Here is how I got it: Navigate to your page, then search for anything that yields more than one page of results. Then, in an actions tree, add a "Wait" action that waits for 5 seconds, press play, and pause right away so that Helium Scraper logs navigations (Helium Scraper only logs while playing or paused; perhaps this should be changed in a future release). While paused, click the "Next" button to go to the next page, and the log will show the URL above. The log can be accessed from Project -> View Log.

You should be able to get your results from that page.
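
For readers who want to check a page for frames outside Helium Scraper, here is a minimal Python sketch that lists the src of every iframe/frame in a piece of HTML. It is only an illustration: the sample markup and the example.com URL are made up, and it will not see a frame that a script injects after the page loads, which is likely why the navigation log was needed here.

```python
from html.parser import HTMLParser


class FrameSrcCollector(HTMLParser):
    """Collects the src attribute of every <iframe> or <frame> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase and attrs as (name, value) pairs.
        if tag in ("iframe", "frame"):
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_frame_urls(html_text):
    """Return the frame URLs found in static HTML, in document order."""
    collector = FrameSrcCollector()
    collector.feed(html_text)
    return collector.sources


# Hypothetical markup, not copied from the real page.
sample = (
    '<html><body><h1>Results</h1>'
    '<iframe name="results" src="http://www.example.com/custom?q=test"></iframe>'
    '</body></html>'
)
print(find_frame_urls(sample))
```
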
Juan Soldi
The Helium Scraper Team

Post by dg151 » Sun May 29, 2011 9:20 pm

Hey Juan,

Thanks for the prompt reply and assistance. Now that I'm able to use a non-iframe search page, I'd like to use the custom-written project file that you released here: viewtopic.php?f=8&t=100. My idea is to open the search URL in the Helium Scraper browser, load the keywords into the SearchTerms table, and then let Helium Scraper run the searches automatically. The only problem is that I can't select the input box and the search button. How can I go about selecting them? So far I'm familiar with selecting URLs and title texts (I learned that from the basic tutorial), but I'm not sure how to select the input box and the search button as kinds. Could you help me out with this? Thanks mate.

Post by webmaster » Mon May 30, 2011 1:48 am

Hello,

You select buttons and input boxes the very same way as text. You will notice that buttons and input boxes won't turn purple, but if you look at the selection panel at the bottom, you will see the item in the list of selected items.
Juan Soldi
The Helium Scraper Team

Post by dg151 » Mon May 30, 2011 5:22 am

Hi Juan,

I have managed to fix the problem I had with selecting the search box and search button. The project "AutoSearch.hsp" is now working perfectly fine. Unfortunately, I'm now having problems integrating it with the other custom-written projects "GoThroughAllPages.hsp" and "DeCaptcher.hsp". I imported the "GoThroughAllPages.hsp" project into the current "AutoSearch.hsp" project and tried executing the "Go Through All Pages" actions tree from the "Execute Actions Tree" actions tree. The strange thing is that for the first search term, the scraper went through all the search results pages for that term, but for the subsequent search terms, it stopped at the first search results page and moved on to the next search term. I think the problem lies in where I inserted the "Execute tree: Go Through All Pages" action.

Also, I have no idea how to integrate the "DeCaptcher.hsp" project smoothly into the current "AutoSearch.hsp". Lastly, I need the scraper to extract all the URLs of the search results for every search term, and I have no idea where to insert the extract action. Do you think you could help me integrate these functions into the "AutoSearch.hsp" project? I would be willing to pay you a token amount for your help and time. Thanks mate. :)

Post by webmaster » Mon May 30, 2011 5:10 pm

Hi,

Here is a project that you can use as a template. A few things to notice:

My starting page yields 100 results per page. This makes extraction faster (fewer navigations). If you look at the URL you will find a "num=100" there. You cannot use values like "num=10000"; 100 is the maximum.
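
As a rough illustration of the "num" point, here is a small Python sketch that sets the parameter on a search URL and caps it at 100. Only the "num" parameter name and the limit of 100 come from this post; the function name and the example.com URL are invented for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

MAX_RESULTS_PER_PAGE = 100  # the maximum stated in the post


def with_results_per_page(url, num):
    """Return url with its "num" query parameter set, clamped to 1..100."""
    num = max(1, min(num, MAX_RESULTS_PER_PAGE))
    parts = urlsplit(url)
    # Keep every existing parameter except a previous "num".
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "num"
    ]
    query.append(("num", str(num)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical URL; asking for 10000 still yields num=100.
print(with_results_per_page("http://www.example.com/custom?hl=en&q=test&start=0", 10000))
```
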

There are 2 "Execute tree: Solve Captcha If Needed" actions. One handles the captcha that could appear after searching, and the other handles the one that could appear after turning the page. You will need to double click each of them to enter your credentials.

It was probably skipping pages for some of your search terms because it was not finding the "Next" button there. The thing to do in this case is to search for one of those terms and check whether the kind that is supposed to select the "Next" button actually selects it, by clicking on the "Select kind in browser" button. If it does not, select the button and add it to your kind with the "Add selection to this kind" button.
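
To make the page-skipping behaviour concrete, here is a toy Python simulation of the loop (this is not how Helium Scraper is implemented; the page data and all function names are invented). When the "Next" finder does not match on a term's first results page, that term stops after one page, which matches the symptom described above. Broadening the finder plays the role of adding the missing button to the kind.

```python
# Invented in-memory "site": (search term, link followed) -> page contents.
FAKE_PAGES = {
    ("cats", None): {"urls": ["cats-1", "cats-2"], "next": "page2"},
    ("cats", "page2"): {"urls": ["cats-3"]},
    # This results page marks its "Next" link differently.
    ("dogs", None): {"urls": ["dogs-1"], "next_variant": "page2"},
    ("dogs", "page2"): {"urls": ["dogs-2"]},
}


def fetch_page(term, link):
    """Stand-in for navigating to a results page."""
    return FAKE_PAGES[(term, link)]


def narrow_find_next(page):
    """A kind that only matches one form of the "Next" button."""
    return page.get("next")


def broad_find_next(page):
    """The same kind after the other form was added to it."""
    return page.get("next") or page.get("next_variant")


def scrape_all_terms(search_terms, find_next):
    """For each term, collect URLs page by page until no "Next" is found."""
    results = {}
    for term in search_terms:
        urls = []
        page = fetch_page(term, None)
        while page is not None:
            urls.extend(page["urls"])
            next_link = find_next(page)
            page = fetch_page(term, next_link) if next_link else None
        results[term] = urls
    return results


print(scrape_all_terms(["cats", "dogs"], narrow_find_next))
print(scrape_all_terms(["cats", "dogs"], broad_find_next))
```

With the narrow finder, "dogs" yields only its first page; with the broad one, both terms are followed to the end.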

Hope that helped.
Attachments
SearchTemplate.hsp
(1.46 MiB) Downloaded 737 times
Juan Soldi
The Helium Scraper Team
