This project was an assignment for the Data Forensics course (JM2040-M-6) at JADS, part of the master's program Data Science in Business and Entrepreneurship.
For this project we analyzed data gathered from the Cocorico Marketplace, a marketplace on the Dark Web. The goal is to provide insight into the drugs sold on the marketplace and a better understanding of drug sales on the Dark Web. We hope to find patterns in the data that can help Law Enforcement Agencies (LEAs) in their fight against drug trafficking.
To gather our data we used a crawler and two scrapers: one to gather product data and one to gather data on the categories. After scraping, the data was processed and analyzed in Python using the Pandas library and regular expressions. The data was then visualized using Matplotlib and Seaborn.
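As an illustration of the regular-expression processing step, here is a minimal sketch of how raw scraped strings could be turned into numbers. The input formats and the function names `parse_price` and `parse_views` are assumptions made for this example; the actual cleaning rules used in the project may differ.

```python
import re

# Assumption: a price string contains one number with "." or "," as the
# decimal separator. Thousands separators are NOT handled in this sketch.
PRICE_PATTERN = re.compile(r"\d+(?:[.,]\d+)?")


def parse_price(raw):
    """Return the first number found in a raw price string as a float, or None."""
    match = PRICE_PATTERN.search(raw)
    if match is None:
        return None
    # Normalize a decimal comma to a decimal point before converting.
    return float(match.group().replace(",", "."))


def parse_views(raw):
    """Return the first integer found in a raw view-count string, or None."""
    match = re.search(r"\d+", raw)
    return int(match.group()) if match else None
```

Functions like these can be applied to a Pandas column, for example with `Series.map`, so that the cleaned values are ready for analysis and plotting.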
For 35 days, we crawled the Cocorico Marketplace and gathered data on the products being sold. We collected data on the different drug categories and on the individual products. The product data includes the price, availability, number of views, the seller, and the seller's rating. We restricted the crawl to drug-related pages by analyzing the URL structure and filtering out pages unrelated to drugs.
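The URL-based filtering could look roughly like the sketch below. The path prefixes in `DRUG_PATH_PREFIXES` and the example addresses are hypothetical placeholders, since the real URL structure of the marketplace is not described here; only the general approach (inspect the URL path, keep matching pages) is taken from the text.

```python
from urllib.parse import urlparse

# Hypothetical path prefixes that mark drug-related pages.
# The real marketplace URLs are an unknown here and will differ.
DRUG_PATH_PREFIXES = ("/category/drugs", "/product/drugs")


def is_drug_url(url):
    """Return True if the URL path starts with one of the drug-related prefixes."""
    path = urlparse(url).path.lower()
    return path.startswith(DRUG_PATH_PREFIXES)


def filter_drug_urls(urls):
    """Keep only the URLs that point to drug-related pages, preserving order."""
    return [url for url in urls if is_drug_url(url)]
```

Filtering before downloading keeps the crawl small, because pages outside the drug categories are never requested or stored.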
We created two scrapers: one to gather product data and one to gather data on the categories. Both were written in Python using the BeautifulSoup library and looped over all crawled files. The product scraper collected data on the individual products, including the price, availability, number of views, the seller, and the seller's rating. The category scraper collected data on the different drug categories and the number of products in each category.
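A product scraper of this kind could be sketched as follows. The CSS selectors in `PRODUCT_SELECTORS`, the `*.html` file pattern, and the sample markup are assumptions for illustration; the real page layout and the project's actual scraper code are not shown here.

```python
from pathlib import Path

from bs4 import BeautifulSoup

# Hypothetical CSS selectors; the real page markup is an unknown here.
PRODUCT_SELECTORS = {
    "price": ".product-price",
    "availability": ".product-availability",
    "views": ".product-views",
    "seller": ".seller-name",
    "seller_rating": ".seller-rating",
}


def parse_product(html):
    """Extract the product fields from one crawled page; missing fields become None."""
    soup = BeautifulSoup(html, "html.parser")
    record = {}
    for field, selector in PRODUCT_SELECTORS.items():
        node = soup.select_one(selector)
        record[field] = node.get_text(strip=True) if node else None
    return record


def scrape_directory(directory):
    """Loop over all crawled HTML files in a directory and parse each one."""
    records = []
    for path in sorted(Path(directory).glob("*.html")):
        html = path.read_text(encoding="utf-8", errors="replace")
        records.append(parse_product(html))
    return records


# Made-up sample page used only to demonstrate parse_product.
SAMPLE_HTML = """
<div class="product">
  <span class="product-price">45.00</span>
  <span class="product-availability">In stock</span>
  <span class="product-views">317</span>
  <a class="seller-name">example_seller</a>
  <span class="seller-rating">4.8</span>
</div>
"""
```

The list of dictionaries returned by `scrape_directory` can be passed directly to `pandas.DataFrame` for the analysis step. A category scraper would follow the same pattern with its own set of selectors.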