This repository contains a copy of machine learning datasets used in tutorials on MachineLearningMastery.com.
This repository was created to ensure that the datasets used in tutorials remain available and are not dependent upon unreliable third parties.
All CSV files for regression and classification problems follow the same conventions: there is no header line, there is no whitespace between columns, the target is the last column, and missing values are marked with a question mark character ('?').
In many cases, tutorials link directly to the raw dataset URL, so dataset filenames should not be changed once they have been added to the repository.
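The CSV conventions described above (no header line, target in the last column, '?' for missing values) can be handled with a few lines of standard-library Python. The following is a minimal sketch, not part of the repository: the helper names `parse_value` and `load_dataset` are hypothetical, and the two sample rows are made up purely for illustration.

```python
import csv
import io

MISSING = "?"  # missing-value marker used by the CSV files


def parse_value(cell):
    """Convert a cell to float where possible; '?' becomes None."""
    cell = cell.strip()
    if cell == MISSING:
        return None
    try:
        return float(cell)
    except ValueError:
        return cell  # keep categorical values and class labels as strings


def load_dataset(handle):
    """Read a header-less CSV and split each row into (features, target)."""
    X, y = [], []
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        values = [parse_value(c) for c in row]
        X.append(values[:-1])  # every column except the last is an input
        y.append(values[-1])   # the last column is the target
    return X, y


# Made-up rows for illustration only, not taken from any file in the repository.
sample = io.StringIO("5.1,3.5,?,a\n4.9,?,1.4,b\n")
X, y = load_dataset(sample)
print(X)
print(y)
```

To read one of the files listed below from a local copy, pass an open file handle instead of the in-memory sample, for example `load_dataset(open("pima-indians-diabetes.csv", newline=""))`.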
Datasets
This section provides a summary of the datasets in this repository.
Binary Classification Datasets
Breast Cancer (Wisconsin) (breast-cancer-wisconsin.csv)
Breast Cancer (Yugoslavia) (breast-cancer.csv)
Breast Cancer (Haberman's) (haberman.csv)
Bank Note Authentication (banknote_authentication.csv)
Horse Colic (horse-colic.csv)
Ionosphere (ionosphere.csv)
Pima Indians Diabetes (pima-indians-diabetes.csv)
Sonar Returns (sonar.csv)
German Credit (german.csv)
Credit Card Fraud (creditcard.csv.zip)
Adult Income (adult-all.csv)
Mammography (mammography.csv)
Oil Spill (oil-spill.csv)
Phoneme (phoneme.csv)
Multiclass Classification Datasets
Glass Identification (glass.csv)
Iris Flower Species (iris.csv)
Wheat Seeds (wheat-seeds.csv)
Wine (wine.csv)
Ecoli (ecoli.csv)
Thyroid Gland (new-thyroid.csv)
Regression Datasets
Boston Housing (housing.csv)
Auto Insurance Total Claims (auto-insurance.csv)
Auto Imports Prices (auto_imports.csv)
Abalone Age (abalone.csv)
Wine Quality Red (winequality-red.csv)
Wine Quality White (winequality-white.csv)
Univariate Time Series Datasets
Daily Minimum Temperatures in Melbourne (daily-min-temperatures.csv)
Daily Maximum Temperatures in Melbourne (daily-max-temperatures.csv)
Daily Female Births in California (daily-total-female-births.csv)
Monthly International Airline Passengers (monthly-airline-passengers.csv)
Monthly Armed Robberies in Boston (monthly-robberies.csv)