<aside> ❓ Study the structure of the HTML file: https://parsons.nyc/aa/m08.html
Write a set of instructions (in narrative form, list form, and/or pseudocode) for how you could parse this HTML file with code. Be clear about how you would organize this data into table(s) consisting of rows and columns, and how you might use the structure of the HTML to find delimiters between the "variables"/columns. A successful submission includes an exhaustive list of the data you would collect and how you could use code/an algorithm to parse this data.
This set of instructions should be clear enough that if I provided the HTML file and your instructions to a data analyst, they would be able to successfully parse this data as you intended.
</aside>
Inspecting the HTML file, it looks like each group has its own <tr> row, and each <tr> contains three <td> elements. Each group would become one row of the output table. This is how I would parse the data within each of the three <td> elements (from left to right) into its own "variables" or columns:
Left <td> element:
HTML tag | Corresponding variable |
---|---|
If there's an <h4> | Name of location |
<b> | Name of group |
Plain text after the <b> element (the inspector displays it in quotation marks), up to the next <div> or <span> | Address |
If there's a <div class="detailsBox"> that follows | Additional Notes |
If there's a <span style> that follows | Wheelchair Access |
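
The left-cell rules above could be sketched roughly as follows. This is a minimal illustration using only Python's standard library, not a definitive implementation: the sample markup, the column names, and the helper functions `clean`, `first_match`, and `parse_left_cell` are all hypothetical, and storing wheelchair access as a true/false value is an assumption about how that column would be used.

```python
import re
from html import unescape

# Hypothetical left-hand cell, modeled on the structure described above.
SAMPLE_LEFT_CELL = """
<h4>Example Hall</h4><br />
<b>EXAMPLE GROUP</b><br />
123 Example Street, 2nd Floor
<br />
<div class="detailsBox">Ring the bell at the side door.</div>
<span style="color:darkblue;">Wheelchair access</span>
"""

def clean(fragment):
    """Strip any tags, decode entities, and collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", fragment)
    return " ".join(unescape(text).split())

def first_match(pattern, html_text):
    """Return the cleaned first capture group, or None if absent or empty."""
    match = re.search(pattern, html_text, flags=re.S | re.I)
    value = clean(match.group(1)) if match else ""
    return value or None

def parse_left_cell(cell_html):
    return {
        # <h4> holds the name of the location, when present.
        "location_name": first_match(r"<h4\b[^>]*>(.*?)</h4>", cell_html),
        # The first <b> holds the group name (\b keeps <br /> from matching).
        "group_name": first_match(r"<b\b[^>]*>(.*?)</b>", cell_html),
        # Plain text after </b>, up to the next <div>/<span> or the cell's end.
        "address": first_match(r"</b>(.*?)(?:<div|<span|$)", cell_html),
        "notes": first_match(
            r'<div\b[^>]*class="detailsBox"[^>]*>(.*?)</div>', cell_html
        ),
        # Assumption: a styled <span> only appears when access is available.
        "wheelchair_access": re.search(r"<span\b[^>]*\bstyle=", cell_html, re.I)
        is not None,
    }

left = parse_left_cell(SAMPLE_LEFT_CELL)
minimal = parse_left_cell("<b>ONLY GROUP</b><br />9 Sample Ave")
```

Optional pieces (the `<h4>`, the details box, the `<span>`) come back as `None` or `False` when they are missing, so every group still yields the same five columns.
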
Center <td> element:
HTML tag | Corresponding variable |
---|---|
If the <b> has a day of the week in its content | Day of the Week |
Plain-text time (the inspector displays it in quotation marks) before <b>to</b> | Starting Time |
Plain-text time (the inspector displays it in quotation marks) after <b>to</b> | Ending Time |
Plain text after the <b> whose content equals "Meeting Type" | Meeting Type |
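
One way to apply the center-cell rules is to walk through each `<b>` label together with the plain text that follows it. The sketch below assumes the hypothetical sample markup shown in the code, and it returns a list in case a single cell describes more than one meeting. The function name `parse_center_cell` and the column names are illustrative only.

```python
import re

DAYS = ("Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday")

# Hypothetical center cell, modeled on the structure described above.
SAMPLE_CENTER_CELL = """
<b>Tuesdays From</b> 7:00 AM <b>to</b> 8:00 AM <br />
<b>Meeting Type</b> OD = Open Discussion meeting
<br /><br />
<b>Thursdays From</b> 6:30 PM <b>to</b> 7:30 PM <br />
<b>Meeting Type</b> B = Beginners meeting
"""

def parse_center_cell(cell_html):
    meetings = []
    # Each pair is (text inside <b>...</b>, plain text that follows it).
    pairs = re.findall(r"<b>(.*?)</b>([^<]*)", cell_html, flags=re.S)
    for label, text in pairs:
        label = " ".join(label.split())
        text = " ".join(text.split())
        day = next((d for d in DAYS if d.lower() in label.lower()), None)
        if day:
            # A day-of-week label starts a new meeting; the text after it
            # is the starting time.
            meetings.append({"day": day, "start_time": text,
                             "end_time": None, "meeting_type": None})
        elif not meetings:
            continue  # ignore labels that appear before any day label
        elif label.lower() == "to":
            meetings[-1]["end_time"] = text
        elif label.lower() == "meeting type":
            meetings[-1]["meeting_type"] = text
    return meetings

meetings = parse_center_cell(SAMPLE_CENTER_CELL)
```

Because each meeting is its own dictionary, a group with several meetings can either be expanded into several rows or kept in a separate meetings table keyed by group.
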
Right <td> element:
HTML tag | Corresponding variable |
---|---|
Value of the href attribute of the <a> element | Directions link |
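
To tie the three cells together, the rows first have to be split into their `<td>` elements. The sketch below does this with regular expressions on a small hypothetical table and then pulls the directions link from the right-hand cell. It assumes tables are not nested and that `href` values are double-quoted; for the real file a proper HTML parser would likely be more robust. `split_rows` and `parse_right_cell` are illustrative names.

```python
import re
from html import unescape

# Hypothetical table with one group row and one non-group row.
SAMPLE_TABLE = """
<table>
  <tr>
    <td><b>EXAMPLE GROUP</b><br />123 Example Street</td>
    <td><b>Mondays From</b> 7:00 PM <b>to</b> 8:00 PM</td>
    <td><a href="https://example.com/directions?id=1&amp;x=2">Get Directions</a></td>
  </tr>
  <tr><td>A row with a single cell, which is skipped</td></tr>
</table>
"""

def split_rows(table_html):
    """Return one list of <td> inner-HTML strings per three-cell <tr>."""
    rows = []
    row_pattern = r"<tr\b[^>]*>(.*?)</tr>"
    cell_pattern = r"<td\b[^>]*>(.*?)</td>"
    for row_html in re.findall(row_pattern, table_html, flags=re.S | re.I):
        cells = re.findall(cell_pattern, row_html, flags=re.S | re.I)
        if len(cells) == 3:  # keep only rows shaped like a group entry
            rows.append(cells)
    return rows

def parse_right_cell(cell_html):
    """Return the decoded href of the first <a>, or None if there is none."""
    match = re.search(r'<a\b[^>]*\bhref="([^"]*)"', cell_html, flags=re.I)
    return unescape(match.group(1)) if match else None

rows = split_rows(SAMPLE_TABLE)
directions = [parse_right_cell(cells[2]) for cells in rows]
```

Each entry in `rows` holds the left, center, and right cell in order, so the per-cell rules can be applied to `cells[0]`, `cells[1]`, and `cells[2]` and merged into a single output row per group.
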