Now that the tags are found, we can create a for loop that fills mydata one table row at a time.
# Create a for loop to fill mydata
for j in table1.find_all('tr')[1:]:
    row_data = j.find_all('td')
    row = [i.text for i in row_data]
    length = len(mydata)
    mydata.loc[length] = row
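The same row-by-row idea can be tried on a tiny table without touching the live page. The sketch below is only an illustration: the HTML snippet, the id "demo" and the names table, rows and demo are made up and do not come from the tutorial. It also collects the rows in a list and builds the data frame once at the end, which is usually faster than growing it with .loc on every pass.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical stand-in for the scraped page; the real table is much larger.
html = """
<table id="demo">
  <tr><th>#</th><th>Country</th><th>Total Cases</th></tr>
  <tr><td>1</td><td>USA</td><td>100</td></tr>
  <tr><td>2</td><td>India</td><td>80</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", id="demo")

# Header cells give the column names.
headers = [th.text.strip() for th in table.find_all("th")]

# Skip the header row, then collect the text of every <td> in each row.
rows = []
for tr in table.find_all("tr")[1:]:
    cells = [td.text.strip() for td in tr.find_all("td")]
    rows.append(cells)

# Build the data frame in one step from the collected rows.
demo = pd.DataFrame(rows, columns=headers)
print(demo)
```

Every row must have the same number of cells as there are headers, otherwise the DataFrame constructor raises an error. The same holds for the .loc assignment used above.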
Result:
Data frame
STEP 9. CLEANING THE DATA FRAME
Next, once the data frame has been successfully created, we can clean it up by deleting the rows we do not need. In our case, we will delete the rows at index 0-6 and 222-228, then reset the index so that it starts from 0 again, and delete the "#" column.
# Drop unnecessary rows
mydata.drop(mydata.index[0:7], inplace=True)
# Note: these positions are counted on the frame as it is after the first drop
mydata.drop(mydata.index[222:229], inplace=True)
mydata.reset_index(inplace=True, drop=True)
# Drop the "#" column
mydata.drop('#', inplace=True, axis=1)
Result:
Final data frame
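Because each drop changes which rows sit at which position, it can help to rehearse the cleanup on a small frame first. The sketch below uses a made-up 12-row frame named demo (not the scraped data) and much smaller slices, so the effect of each step is easy to follow.

```python
import pandas as pd

# Hypothetical 12-row frame standing in for the scraped table.
demo = pd.DataFrame({"#": range(12), "Country": [f"row{i}" for i in range(12)]})

# Drop the first 3 rows (index labels 0, 1, 2).
demo.drop(demo.index[0:3], inplace=True)

# Positions are counted again on the shortened frame:
# positions 6..8 now hold the rows labelled 9, 10, 11.
demo.drop(demo.index[6:9], inplace=True)

# Renumber the remaining rows from 0 and discard the old labels.
demo.reset_index(inplace=True, drop=True)

# Remove the "#" column.
demo.drop("#", inplace=True, axis=1)

print(demo["Country"].tolist())
```

Printing mydata.index before and after each drop on the real data is a quick way to confirm that the intended rows were removed.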
STEP 10. EXPORT DATA TO CSV FILE
Once the data frame is ready, we can export the data to a CSV file.
# Export to csv
mydata.to_csv('covid_data.csv', index=False)
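A quick way to check an export is to read the file back and compare it with the data frame. The sketch below is a minimal round trip on a made-up two-row frame; the names demo, check and demo_export.csv are placeholders, and the file is written to a temporary folder so that nothing in the working directory is overwritten.

```python
import os
import tempfile

import pandas as pd

# Hypothetical small frame; scraped cell values are text, so strings are used here.
demo = pd.DataFrame({"Country": ["USA", "India"], "Total Cases": ["100", "80"]})

# Write to a temporary folder; index=False keeps the row numbers out of the file.
path = os.path.join(tempfile.mkdtemp(), "demo_export.csv")
demo.to_csv(path, index=False)

# Read the file back as text and compare it with the original frame.
check = pd.read_csv(path, dtype=str)
print(check.columns.tolist())
print(len(check) == len(demo))
```

Without index=False, the row index would be written as an extra unnamed first column in the CSV file.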