Supervised machine learning for spam classification, using a data set of 2,500 ham and 500 spam emails. The data is split into training and test sets of various sizes to evaluate how the classifier's performance changes with the split. (Python)
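
The description above does not say which classifier or split sizes are used, so the following is only a minimal sketch of the general workflow, not the project's actual code. It assumes a multinomial Naive Bayes model with Laplace smoothing, a stratified split, and a synthetic toy corpus standing in for the real emails; the names `make_toy_corpus`, `stratified_split`, `train_naive_bayes`, `predict`, and `accuracy` are all hypothetical, as are the test fractions tried.

```python
import math
import random
from collections import Counter


def make_toy_corpus(n_ham=2500, n_spam=500, seed=0):
    """Build a synthetic stand-in corpus (assumption: NOT the real data set)."""
    rng = random.Random(seed)
    ham_words = ["meeting", "report", "lunch", "project", "thanks"]
    spam_words = ["free", "winner", "prize", "click", "offer"]
    emails = [(" ".join(rng.choices(ham_words, k=6)), "ham") for _ in range(n_ham)]
    emails += [(" ".join(rng.choices(spam_words, k=6)), "spam") for _ in range(n_spam)]
    return emails


def stratified_split(emails, test_fraction, seed=0):
    """Split (text, label) pairs so both parts keep the ham/spam ratio."""
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted({lbl for _, lbl in emails}):
        group = [e for e in emails if e[1] == label]
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def train_naive_bayes(train):
    """Collect per-class word counts, per-class document counts, and the vocabulary."""
    word_counts, class_counts, vocab = {}, Counter(), set()
    for text, label in train:
        words = text.lower().split()
        class_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(words)
        vocab.update(words)
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior score."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Laplace (add-one) smoothing avoids log(0) for unseen words.
            score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, test):
    """Fraction of test emails whose predicted label matches the true label."""
    return sum(predict(model, text) == label for text, label in test) / len(test)


# Evaluate the same classifier on several train/test split sizes.
emails = make_toy_corpus()
for test_fraction in (0.1, 0.2, 0.3, 0.5):
    train, test = stratified_split(emails, test_fraction)
    model = train_naive_bayes(train)
    print(f"test fraction {test_fraction:.1f}: train={len(train)}, "
          f"test={len(test)}, accuracy={accuracy(model, test):.3f}")
```

Because ham outnumbers spam five to one, a stratified split is used here so that small test sets still contain spam; with an imbalance like this, per-class metrics such as precision and recall on the spam class are usually more informative than raw accuracy.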