Dmoz-tddli.rar May 2026
Since DMOZ officially closed in March 2017, a significant portion of the URLs in this archive may lead to dead links or parked domains.
Unlike machine-generated lists, DMOZ data was curated by over 90,000 volunteer editors, making the classifications highly accurate for its time.
About the dataset: this is a URL classification dataset drawn from the DMOZ directory, with 15 classes.
“DMOZ — the Open Directory Project — officially closed today. It marks the end of an era of humans trying to catalog the entire web.” (Search Engine Land, March 2017)
The data includes deep taxonomic paths (e.g., Science/Technology/Space), which makes it well suited to testing multi-level classification algorithms.
While there is no public "official review" of this specific file, it likely contains a subset or processed version of the DMOZ (Open Directory Project) dataset, which is frequently used in data science for URL classification and web-scraping research.
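For multi-level classification, a slash-separated category path such as Science/Technology/Space can be expanded into one label per depth. The sketch below is only an illustration: it assumes categories are stored as slash-separated strings, which has not been verified for this archive, and the function name path_levels is hypothetical.

```python
def path_levels(path, sep="/"):
    """Return the cumulative prefixes of a category path, one per depth.

    Assumes (unverified for this archive) that categories are stored as
    separator-delimited strings such as "Science/Technology/Space".
    """
    # Drop empty segments so leading, trailing, or doubled separators are ignored.
    parts = [p.strip() for p in path.split(sep) if p.strip()]
    # Level 1 is the top category; each later level appends one more segment.
    return [sep.join(parts[:i]) for i in range(1, len(parts) + 1)]


if __name__ == "__main__":
    for depth, label in enumerate(path_levels("Science/Technology/Space"), start=1):
        print(depth, label)
```

Each prefix can then serve as the target for a classifier at that depth, so a coarse top-level model and a finer deep-level model can be evaluated on the same URLs.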