Basically, I want to create a single table, but each row within that table needs to contain information from three different html pages. Before I get into too much detail about it, is there a straightforward way to do this with Helium Scraper?
If it helps for finding the right row, there is one unique piece of data (call it 'idNumber') that could be used as a primary key for each row. I can get the relevant idNumber directly from each of the three html pages for cross-referencing, although I will have to define three different kinds for it because the number sits in a different place in the html structure of each page.
Data from different pages in the same table/row?
Re: Data from different pages in the same table/row?
Hi,
Each row can only be extracted from a single page at a time when using an "Extract" action. What you can do is create a separate "Extract" action for each page, have each one extract to a different table, and extract the "idNumber" to every table so that you can then use a SQL JOIN to combine them into a single table. If the "idNumber" is not a number you would need to use a SQL WHERE, such as "SELECT * FROM [Table1], [Table2] WHERE [Table1].[idNumber] = [Table2].[idNumber]".
If you have never used SQL, I recommend taking a look at this tutorial. It is actually a very straightforward language, as you can probably tell by looking at my example above.
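For the three-page case from the original question, the same idea extends to three tables. Here is a minimal sketch of it, run through Python's built-in sqlite3 module so it can be tried outside Helium Scraper. The table names ([Table1], [Table2], [Table3]), the extra columns (name, price, stock) and the sample rows are made up for illustration, and the SQL dialect Helium Scraper accepts may differ slightly (some engines want parentheses around nested joins).

```python
import sqlite3


def join_pages(rows1, rows2, rows3):
    """Join three per-page tables on idNumber and return the combined rows.

    Each argument is a list of (idNumber, value) tuples, standing in for
    what one "Extract" action would have written to its own table.
    """
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # One table per page type; the column names are hypothetical.
    cur.execute("CREATE TABLE [Table1] ([idNumber] TEXT, [name] TEXT)")
    cur.execute("CREATE TABLE [Table2] ([idNumber] TEXT, [price] TEXT)")
    cur.execute("CREATE TABLE [Table3] ([idNumber] TEXT, [stock] TEXT)")
    cur.executemany("INSERT INTO [Table1] VALUES (?, ?)", rows1)
    cur.executemany("INSERT INTO [Table2] VALUES (?, ?)", rows2)
    cur.executemany("INSERT INTO [Table3] VALUES (?, ?)", rows3)

    # Only idNumbers present in all three tables produce a combined row.
    cur.execute(
        """
        SELECT [Table1].[idNumber], [Table1].[name],
               [Table2].[price], [Table3].[stock]
        FROM [Table1]
        INNER JOIN [Table2] ON [Table1].[idNumber] = [Table2].[idNumber]
        INNER JOIN [Table3] ON [Table1].[idNumber] = [Table3].[idNumber]
        ORDER BY [Table1].[idNumber]
        """
    )
    combined = cur.fetchall()
    conn.close()
    return combined


if __name__ == "__main__":
    page1 = [("A1", "Widget"), ("B2", "Gadget")]
    page2 = [("B2", "9.99"), ("A1", "4.50")]
    page3 = [("A1", "12"), ("C3", "0")]
    for row in join_pages(page1, page2, page3):
        print(row)
```

Note that an inner join drops any idNumber that is missing from one of the tables (here "B2" and "C3"). If rows with partial data should be kept, a LEFT JOIN starting from the most complete table would be the usual alternative.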
Another option would be to extract the rows from JavaScript code to a single table, but this might be more complicated.
Let me know if you need further help.
Juan Soldi
The Helium Scraper Team
Re: Data from different pages in the same table/row?
Thanks! The SQL JOIN will work.