Phase 2: What/How
Posted: Sat Dec 21, 2024 4:37 am
The question pointed straight to an obvious element on Amazon: public reviews.
Analyzing a good number of reviews, for some of the products both sold on Amazon and supplied by PTS srl, seemed like the way to go. But how to do it? By hand, reeling them off one by one? Unthinkable!
I then turned to the magical power of data scraping: techniques for harvesting specific morsels of information from websites. The recipe now had its ingredients:
a data scraping script for public reviews on Amazon;
a list of the 4 most “popular” multifunction printers on the market, both supplied by PTS srl and on sale on Bezos' creation;
data parsing tools (data cleaning).
scraping from Amazon site
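The original script is not shown, so what follows is only a minimal sketch of the general idea behind a review scraper: walk the HTML of a page and pull out the star rating and the text of each review. The markup in SAMPLE_HTML, the class names and the data-stars attribute are all invented for illustration and are not Amazon's real page structure; a real script would also need to fetch the pages and respect the site's terms of use.

```python
from html.parser import HTMLParser

# Hypothetical markup, invented for this sketch. It is NOT Amazon's real HTML.
SAMPLE_HTML = """
<div class="review" data-stars="5"><p class="text">Great printer for the office.</p></div>
<div class="review" data-stars="2"><p class="text">Toner ran out after a week.</p></div>
"""


class ReviewParser(HTMLParser):
    """Collects {"stars": int, "text": str} dicts from the hypothetical markup."""

    def __init__(self):
        super().__init__()
        self.reviews = []
        self._stars = None      # rating of the review currently being read
        self._in_text = False   # True while inside <p class="text">
        self._buffer = []       # text fragments of the current review

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div" and attrs.get("class") == "review":
            self._stars = int(attrs["data-stars"])
        elif tag == "p" and attrs.get("class") == "text" and self._stars is not None:
            self._in_text = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_text:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_text:
            text = "".join(self._buffer).strip()
            self.reviews.append({"stars": self._stars, "text": text})
            self._in_text = False
            self._stars = None


def extract_reviews(html):
    """Return the list of reviews found in an HTML string."""
    parser = ReviewParser()
    parser.feed(html)
    parser.close()
    return parser.reviews


reviews = extract_reviews(SAMPLE_HTML)
```

Keeping the rating next to the text at this stage matters later, because the star count is what the reviews are grouped by.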
I ran the script on nearly 1,000 public reviews across the 4 printer model pages, and collecting the data was quick and easy. But what the heck do you look for in all those reviews now?
After a necessary cleaning of the collected data, which was in fact “dirty” (for example, reviews with no complete sentences, only clipped answers: “Ok”; “it works”; “arrived I like it thanks Amazon”; etc.), there was only one path to follow: an analysis of the sentiment and of the most pertinent reviews:
not having yet found a reliable sentiment analysis tool, I entrusted the judgment of how positive or negative the Amazon shopping experience was to the human eye. The useful reviews left after the parsing (about 400) were therefore read and categorized one by one;
the sentiment categorization was done on a grid of 5 aspects related to the experience of purchasing and using the printer (technical assistance; delivery; product quality/functioning; toner and cartridges; training in using the product);
the reviews were then divided into two groups: reviews from 1 to 3 stars (totally/very/moderately dissatisfied) and reviews of 4 and 5 stars (very/totally satisfied);
a bag-of-words (a “bag” of identified keywords to draw from) was used to distinguish reviews from “professional” customers (business use of the printer) from those of “private” customers (home use of the printer).
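The mechanical parts of these steps (dropping clipped reviews, splitting by stars, tagging the customer type with a bag-of-words) could be sketched roughly as follows. This is only an illustration under stated assumptions: the 8-word threshold and both keyword sets are hypothetical, since the post does not list the actual cleaning rule or keywords, and the sentiment reading itself stays manual, as described above.

```python
import re

# Hypothetical keyword "bags"; the real lists used in the analysis are not given.
PROFESSIONAL_KEYWORDS = {"office", "company", "business", "invoice", "employees", "clients"}
PRIVATE_KEYWORDS = {"home", "family", "kids", "homework", "school", "photos"}

# Assumed cleaning rule: reviews shorter than this many words count as "dirty".
MIN_WORDS = 8


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def is_useful(text, min_words=MIN_WORDS):
    """Keep only reviews long enough to carry a real opinion."""
    return len(tokenize(text)) >= min_words


def star_group(stars):
    """1 to 3 stars -> dissatisfied; 4 and 5 stars -> satisfied."""
    return "dissatisfied" if stars <= 3 else "satisfied"


def customer_type(text):
    """Compare how many keywords from each bag appear in the review."""
    words = set(tokenize(text))
    professional_hits = len(words & PROFESSIONAL_KEYWORDS)
    private_hits = len(words & PRIVATE_KEYWORDS)
    if professional_hits > private_hits:
        return "professional"
    if private_hits > professional_hits:
        return "private"
    return "unknown"  # no keywords found, or a tie


def prepare(reviews):
    """Clean the reviews and attach star group and customer type to each one."""
    prepared = []
    for review in reviews:
        if not is_useful(review["text"]):
            continue
        prepared.append({
            **review,
            "group": star_group(review["stars"]),
            "customer": customer_type(review["text"]),
        })
    return prepared


sample = [
    {"stars": 5, "text": "Ok"},
    {"stars": 4, "text": "arrived I like it thanks Amazon"},
    {"stars": 2, "text": "We bought it for the office and the employees print invoices every day"},
    {"stars": 5, "text": "Perfect for home, my kids print their homework with it all the time"},
]
result = prepare(sample)
```

With this sample the two clipped reviews are dropped and the remaining two are tagged; reviews that match no keyword end up as "unknown" and would still need a human look.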