selection mode

relicon
Posts: 14
Joined: Sun Oct 02, 2011 3:23 pm

selection mode

Post by relicon » Sun Oct 02, 2011 4:34 pm

When Helium Scraper scrapes data, it sometimes skips some of it, even though the skipped items can be selected and are recognized by the Kind.

For example, I was scraping prices and item numbers on eBay. Some prices and some item numbers were skipped, even though they can be selected and are recognized by the Kind that I added.

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Re: selection mode

Post by webmaster » Mon Oct 03, 2011 4:59 pm

Hi,

If your kind is properly selecting them, this is most likely because the item loads dynamically (with AJAX) and has not fully loaded when you attempt to extract it. One solution would be to use the Force Select premade. If you wish, send me your project and I'll see what the problem is.
Juan Soldi
The Helium Scraper Team

relicon
Posts: 14
Joined: Sun Oct 02, 2011 3:23 pm

Re: selection mode

Post by relicon » Mon Oct 03, 2011 8:55 pm

Hi Webmaster,

You are definitely right about the AJAX that eBay uses. I have attached the project below. By the way, how do you add the premade "Force Select"?

Thanks a lot.
Attachments
Ebay Item Number Scraper # 1.hsp
(385.93 KiB) Downloaded 1045 times

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Re: selection mode

Post by webmaster » Mon Oct 03, 2011 10:14 pm

Hi,

Here is a version of your project that uses the Force Select action. This action can be used like any other action from New action -> Execute actions tree -> More.... All you need to do is place it before your Extract action so that execution holds until the item number has loaded. You can double-click it to see how it is configured.
Attachments
Ebay2.hsp
(516.97 KiB) Downloaded 1043 times
Juan Soldi
The Helium Scraper Team

relicon
Posts: 14
Joined: Sun Oct 02, 2011 3:23 pm

Re: selection mode

Post by relicon » Mon Oct 03, 2011 10:53 pm

Thanks a whole lot Webmaster, this is the one I've been looking for!

I also figured out that I can add the "Wait" action.

There is only one thing though...

When I played the Ebay2.hsp, I checked the memory usage and it was increasing a lot.

Is there any way to lower the memory usage?

It seems to keep adding up. And then it would load very slowly.

The memory usage is constantly increasing and it never stops increasing.

You can check the memory usage by playing Ebay2.hsp, pressing Ctrl+Alt+Delete to open the Task Manager window, and clicking the "Processes" tab. You will find that Helium Scraper's memory usage keeps increasing.

relicon
Posts: 14
Joined: Sun Oct 02, 2011 3:23 pm

Re: selection mode

Post by relicon » Mon Oct 03, 2011 11:03 pm

Oh wait!

I think I know how to solve this very easily.

I think I need to add a "Wait" action and then disable all of my JavaScript....

I'll let u know the results soon.

I believe this will prevent the memory usage from increasing.

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Re: selection mode

Post by webmaster » Tue Oct 04, 2011 12:13 am

Hi,

Regarding the Wait action, it will work, but I'd recommend using Force Select instead. With a fixed Wait, the item may load sooner than the time you set, which makes your extraction slower than necessary, or it may take longer, in which case the item number won't be extracted. The Force Select action waits only as long as necessary.
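
To illustrate the difference between a fixed wait and a wait that ends as soon as the item appears, here is a minimal Python sketch of the polling idea. It is not how Force Select is implemented inside Helium Scraper; the names `wait_until_present` and `make_slow_probe`, and the timeout and interval values, are hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_until_present(
    probe: Callable[[], Optional[T]],
    timeout: float = 10.0,
    interval: float = 0.25,
) -> Optional[T]:
    """Call probe() until it returns a value or the timeout expires.

    Returns the value as soon as it is available, so the wait is never
    longer than needed. Returns None if the timeout is reached first
    (what to do on timeout is a choice of this sketch).
    """
    deadline = time.monotonic() + timeout
    while True:
        value = probe()
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


def make_slow_probe(ready_after_calls: int, value: str) -> Callable[[], Optional[str]]:
    """Simulate an element that only appears after a few checks (hypothetical)."""
    calls = {"n": 0}

    def probe() -> Optional[str]:
        calls["n"] += 1
        return value if calls["n"] >= ready_after_calls else None

    return probe


# The simulated item number shows up on the third check.
item_number = wait_until_present(make_slow_probe(3, "ITEM-1"), timeout=2.0, interval=0.01)
```

A fixed wait corresponds to a single `time.sleep(n)` followed by one check: if `n` is too small the item is missed, and if it is too large every page pays the full delay.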

Regarding the memory issue, how high did your memory get? The slower speed could be due to slower server response. However, we have experienced memory issues with Manta, which could be related to the one you are having. One possible solution is to install Internet Explorer 9 and make the following change in the Windows registry, which will force Helium Scraper to use the IE 9 component (instead of IE 7, which is the default):

  1. Click Start -> Run.
  2. Type regedit and press OK.
  3. Navigate to the "HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION" folder from the left panel.
  4. Make sure the "FEATURE_BROWSER_EMULATION" folder is selected, and from the main menu, go to Edit -> New -> DWORD (32-bit) Value.
  5. Rename the newly added value to Helium Scraper.exe.
  6. Double-click Helium Scraper.exe.
  7. Select the Decimal base.
  8. Set the Value data to 9000 and press OK.
Only perform these changes after you have installed IE9.
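
As an alternative to clicking through regedit, the same value could be written from a .reg file. The Python sketch below only builds the text of such a file from the key, value name, and data given in the steps above; the function name `build_emulation_reg` is hypothetical, and the result has not been tested against Helium Scraper. Decimal 9000 is written as the hexadecimal DWORD 00002328.

```python
# Registry key named in step 3 of the instructions.
EMULATION_KEY = (
    r"HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer"
    r"\Main\FeatureControl\FEATURE_BROWSER_EMULATION"
)


def build_emulation_reg(exe_name: str = "Helium Scraper.exe", mode: int = 9000) -> str:
    """Return .reg file text that sets a DWORD named exe_name to mode.

    .reg files store DWORD data as 8 hexadecimal digits, so the decimal
    value from step 8 is converted here.
    """
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{EMULATION_KEY}]",
        f'"{exe_name}"=dword:{mode:08x}',
        "",
    ]
    return "\r\n".join(lines)


reg_text = build_emulation_reg()
```

Saving `reg_text` to a file with a .reg extension and double-clicking it would prompt Windows to merge it into the registry. As with the manual steps, this should only be done after IE9 is installed.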

Let me know how this goes.
Juan Soldi
The Helium Scraper Team

webmaster
Site Admin
Posts: 521
Joined: Mon Dec 06, 2010 8:39 am

Re: selection mode

Post by webmaster » Tue Oct 04, 2011 12:14 am

The problem with disabling JavaScript is that it might affect Helium Scraper functionality. For instance, the Force Select action won't work because it uses JavaScript (I assume you mean disabling it from your Internet Options).
Juan Soldi
The Helium Scraper Team

relicon
Posts: 14
Joined: Sun Oct 02, 2011 3:23 pm

Re: selection mode

Post by relicon » Tue Oct 04, 2011 12:19 am

I will install IE9 and follow the instructions that you have provided.

And yes, you are right. When I disabled JavaScript, Force Select didn't work, and even though I added a waiting time, some of the item numbers were not scraped.

relicon
Posts: 14
Joined: Sun Oct 02, 2011 3:23 pm

Re: selection mode

Post by relicon » Tue Oct 04, 2011 12:21 am

There is one more thing though....

What option do I add if there is no item number?

I found out that if the page lacks the item number, it says "Timeout reached when trying to select 'item number.' Execution will pause."

How do you make it continue without manually clicking play?

I was wondering whether, if it cannot Force Select the item number (because it's missing from the page), it could simply skip it.

Is there any way to configure that?
