Navigation fails to complete

Posted: Fri Jun 01, 2012 4:29 am
by Kyen
First off, I wanted to say that the scraper is so far amazing, and I'm definitely going to buy it if I can get over this small problem I'm having now.

I'm scraping Fandango for movie times for an app I'm working on, searching incrementally by zip code. Not all zip codes are associated with theaters, though, and those that aren't show a page saying that there are no results. About 97% of the time that the scraper encounters one of these pages, it breaks: it reports that it is loading my JavaScript (three lines that navigate to the next zip code) and never finishes.

I haven't been able to find a way around this, and help would be greatly appreciated.

Here's a copy of the log when navigation fails:

Code:

Navigation	Navigation Started		11:12:56 PM	
Navigation	Partial Navigation Started	11:12:56 PM	
Navigation	Partial Navigation Started	11:12:56 PM	
Navigation	Partial Navigation Completed	11:12:56 PM	
Navigation	Partial Navigation Started	11:12:57 PM	
Navigation	Partial Navigation Completed	11:12:57 PM	
Navigation	Partial Navigation Started{"fpc":"bfcd1ff-137a54c2579-7f345145-134","sessionID":"1338523974755.19808","sourceURL":"","hostname":"","location":"%252F36693_movietimes","publisher":"9cfd4f08-0ecf-4989-a9da-19c9693cda97","shareHash":"sthash.aXW7bu7L","incomingHash":"","refDomain":"","refQuery":"36692_movietimes"}/shareInfo={"url":"","sharURL":"","source":"share4x","title":"","ts1338523974778.0":""}	11:12:57 PM	
Navigation	Partial Navigation Completed{"fpc":"bfcd1ff-137a54c2579-7f345145-134","sessionID":"1338523974755.19808","sourceURL":"","hostname":"","location":"%252F36693_movietimes","publisher":"9cfd4f08-0ecf-4989-a9da-19c9693cda97","shareHash":"sthash.aXW7bu7L","incomingHash":"","refDomain":"","refQuery":"36692_movietimes"}/shareInfo={"url":"","sharURL":"","source":"share4x","title":"","ts1338523974778.0":""}	11:12:57 PM	
Navigation	Partial Navigation Started	11:12:57 PM	
Navigation	Partial Navigation Started	11:12:57 PM	
Navigation	Partial Navigation Completed	11:12:57 PM	
Navigation	Partial Navigation Completed	11:12:57 PM	
Navigation	Partial Navigation Started,signed_request,code&sdk=joey	11:12:57 PM	
Navigation	Partial Navigation Completed,signed_request,code&sdk=joey	11:12:57 PM	
Navigation	Partial Navigation Started;wi.728;hi.90/01/1603871?click=;h=v8/3c87/3/0/*/c;257776846;0-0;0;10934597;3454-728/90;48482089/48481582/1;u=tile*1^size*728x90,960x150^cid*0^pm*1;~okv=;tile=1;entry=no;dcopt=ist;gal=;mv=;dma=mobilepensacola;zip=36693;cid=0;tid=;genre=;rt=;pname=;sweeps=;promo=;env=production;entry=no;sz=728x90,960x150;pm=1;abtest=0;u=tile*1^size*728x90,960x150^cid*0^pm*1;bsg=102437;bsg=103268;bsg=103666;bsg=103670;bsg=111304;;~sscs=?	11:12:58 PM	
Navigation	Partial Navigation Started	11:12:58 PM	
Navigation	Partial Navigation Started	11:12:58 PM	
Navigation	Partial Navigation Completed;wi.728;hi.90/01/1603871?click=;h=v8/3c87/3/0/*/c;257776846;0-0;0;10934597;3454-728/90;48482089/48481582/1;u=tile*1^size*728x90,960x150^cid*0^pm*1;~okv=;tile=1;entry=no;dcopt=ist;gal=;mv=;dma=mobilepensacola;zip=36693;cid=0;tid=;genre=;rt=;pname=;sweeps=;promo=;env=production;entry=no;sz=728x90,960x150;pm=1;abtest=0;u=tile*1^size*728x90,960x150^cid*0^pm*1;bsg=102437;bsg=103268;bsg=103666;bsg=103670;bsg=111304;;~sscs=?	11:12:59 PM	
Navigation	Partial Navigation Completed	11:12:59 PM	
Navigation	Partial Navigation Completed	11:12:59 PM	
Thanks in advance for the help!

Re: Navigation fails to complete

Posted: Fri Jun 01, 2012 8:12 pm
by Kyen
After some more testing, I've found that the problem isn't Helium but a flaw in the pages themselves. They don't finish loading, sometimes for several minutes, and it seems that JavaScript's page navigation doesn't report completion until the next page has loaded. I know that this probably won't be a common problem, but if you could add some way of navigating to a dynamic URL that doesn't appear on the page, perhaps using variable functions, that would be awesome. In any case, thanks again!

Re: Navigation fails to complete

Posted: Fri Jun 01, 2012 10:49 pm
by webmaster
Hi Kyen,

Try setting a lower navigation timeout in Project -> Options -> Navigation Timeout. Note that this timeout is only used while running an extraction. You can find more info in the Project Options section in the documentation.

What do you mean by navigating to a dynamic URL that doesn't appear on the page?

Re: Navigation fails to complete

Posted: Fri Jun 01, 2012 11:02 pm
by Kyen
Thanks. I saw the Timeout earlier, but for some reason didn't think to adjust it. As for the dynamic URL, I'm having to use JavaScript to navigate from the page for one zip code to the page for the next, incrementally. It's not a hassle to use JavaScript for this, but I feel that a built-in way to do it would make Helium a little more versatile.

Thanks for the help with the Timeout - it's working fine now!
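
For anyone reading later, the kind of incremental navigation described above might look roughly like the sketch below. It is not the original three lines, which were never posted. The `/<zip>_movietimes` path shape is taken from the log in the first post; the helper name `nextZipUrl` and the `example.com` host in the usage note are made up for illustration.

```javascript
// Hypothetical helper: given a URL whose path contains "/<5-digit zip>_movietimes",
// return the same URL with the zip code incremented by one.
// Returns null if the pattern is missing or the zip would exceed 99999.
function nextZipUrl(currentUrl) {
  const match = currentUrl.match(/\/(\d{5})_movietimes/);
  if (!match) return null;

  const nextZip = Number(match[1]) + 1;
  if (nextZip > 99999) return null;

  // Keep leading zeros so that zips such as 00501 stay five digits long.
  const padded = String(nextZip).padStart(5, "0");
  return currentUrl.replace(match[0], "/" + padded + "_movietimes");
}

// Only navigate when running inside a browser page.
if (typeof window !== "undefined") {
  const target = nextZipUrl(window.location.href);
  if (target) {
    window.location.href = target;
  }
}
```

Called on its own, `nextZipUrl("http://example.com/36692_movietimes")` gives `"http://example.com/36693_movietimes"`. Keeping the URL arithmetic in a separate function makes it possible to check the logic outside the scraper before relying on it during an extraction.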

Re: Navigation fails to complete

Posted: Fri Jun 01, 2012 11:07 pm
by webmaster
You can do this with the URL Variations premade, which can be accessed from File -> Online Premades or New Action -> Execute Actions Tree -> More... to automatically add it as an action. There is more detailed information on how to use it in the project's description.

Re: Navigation fails to complete

Posted: Wed Jul 30, 2014 3:27 am
by GetTeck
We are trying to scrape the page and the pages never finish loading. Because of this, the application errors out when trying to scrape the data, saying that the object is not available.

Adjusting the timeout does not resolve the issue. The pages load without any unusual delay in IE, Chrome, Firefox, etc.; the problem occurs only in Helium Scraper.

We spent a lot of time learning the product and creating and testing the project; however, this defect in the program appears to render the application completely useless.

There are a great many complaints regarding this issue, and we have no intention of buying a program that does not work. There also appears to be no intention on the part of PHPBB to resolve the issue.

Are there any plans to resolve this issue or is the program simply being abandoned?