STEP 9. CLEANING THE DATA FRAME

Post by **hasibaakterss3309** » Sun Jan 19, 2025 10:18 am

The tags have been found, so we can now write a for loop to fill the data frame.

# Create a for loop to fill mydata
for j in table1.find_all('tr')[1:]:       # skip the header row
    row_data = j.find_all('td')           # all cells in this row
    row = [i.text for i in row_data]      # cell text only
    length = len(mydata)
    mydata.loc[length] = row              # append as the next row

Result:

Data frame
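As a side note on the loop above: adding rows one at a time with .loc works, but on larger tables it is usually quicker to collect the rows in a plain list and build the data frame once at the end. Below is a minimal sketch of that idea. The headers and rows are made-up stand-ins for the scraped cells, and the names scraped_rows and demo are only for illustration.

```python
import pandas as pd

# Hypothetical headers and cell texts, standing in for the scraped table
headers = ["#", "Country", "Total Cases"]
scraped_rows = [
    ["1", "A-land ", "100"],
    ["2", "B-land", "250"],
    ["3", "C-land", "75"],
]

# Collect every cleaned row in a plain Python list first ...
rows = []
for cells in scraped_rows:
    rows.append([text.strip() for text in cells])

# ... then build the data frame in a single step
demo = pd.DataFrame(rows, columns=headers)
print(demo)
```

The result has the same shape as what the .loc loop produces; only the way the rows are accumulated differs.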

Next, once the data frame has been created successfully, we can drop the rows we do not need and clean it up. In our case, we drop the rows at index 0-6 and 222-228, reset the index so it starts from 0 again, and delete the "#" column.

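# Drop unnecessary rows
mydata.drop(mydata.index[0:7], inplace=True)
# Note: mydata.index[...] slices by position in the current frame,
# so this slice is taken after the first seven rows are already gone
mydata.drop(mydata.index[222:229], inplace=True)
# Renumber the remaining rows from 0 and discard the old index
mydata.reset_index(inplace=True, drop=True)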

# Drop "#" column
mydata.drop('#', inplace=True, axis=1)

Result:

Final data frame
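If it is not obvious how two positional drops in a row interact, a small toy example may help. This is only a sketch on invented data (10 rows instead of the real table, and the names demo and tail_labels are hypothetical), but it follows the same drop, drop, reset, drop-column sequence.

```python
import pandas as pd

# Toy frame with 10 rows and a "#" column (invented data)
demo = pd.DataFrame({"#": range(10), "Country": [f"row{i}" for i in range(10)]})

# Drop the first two rows by position
demo.drop(demo.index[0:2], inplace=True)

# The index still carries the old labels 2..9, so positions 6 and 7
# now point at the rows that were originally labelled 8 and 9
tail_labels = list(demo.index[6:8])

# Drop those two trailing rows, renumber from 0, and remove the "#" column
demo.drop(demo.index[6:8], inplace=True)
demo.reset_index(drop=True, inplace=True)
demo.drop("#", axis=1, inplace=True)
print(demo)
```

So when picking the second slice, it is worth checking the frame after the first drop rather than before it, since the positions have shifted by then.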

STEP 10. EXPORT DATA TO CSV FILE
Once the data frame is ready, we can export the data to a CSV file.

# Export to csv
mydata.to_csv('covid_data.csv', index=False)
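
To check that an export like this worked, one option is to read the file straight back in. The sketch below does that with a small invented frame and a temporary folder, so it does not touch the real covid_data.csv. The file name covid_data_demo.csv and the variable names are only for illustration.

```python
import os
import tempfile

import pandas as pd

# Small invented frame standing in for the cleaned data
demo = pd.DataFrame({"Country": ["A-land", "B-land"], "Total Cases": ["100", "250"]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "covid_data_demo.csv")
    # index=False keeps the row numbers out of the file
    demo.to_csv(path, index=False)
    # Read the file back to confirm columns and row count survived
    check = pd.read_csv(path)

print(check)
```

Keep in mind that read_csv guesses column types, so text that looks like a number comes back as a number unless you pass dtype=str.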