K-means clustering - find clusters of documents and label each cluster with an appropriate class. Later, use a supervised algorithm trained on those labels to predict the class of new documents.
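A minimal sketch of this workflow, using only the Python standard library. The six toy documents, the cluster names "sports" and "finance", the farthest-first centroid initialisation, and the choice of k-nearest-neighbours as the supervised step are all assumptions made for illustration; the note itself does not specify them. In practice a library such as scikit-learn would normally replace the hand-written helpers.

```python
import math
from collections import Counter

# Toy corpus (hypothetical): three sports documents, then three finance ones.
docs = [
    "football match goal team",
    "team wins football cup",
    "goal scored in match",
    "stock market shares fall",
    "market investors buy shares",
    "investors sell stock shares",
]
vocab = sorted({word for doc in docs for word in doc.split()})

def vectorize(doc):
    """Bag-of-words vector over vocab, scaled to unit length."""
    counts = Counter(doc.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # guard against all-zero vectors
    return [x / norm for x in vec]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=10):
    """Plain k-means; returns the cluster index assigned to each point."""
    # Deterministic farthest-first initialisation (an assumption, chosen so
    # the example is reproducible without a random seed).
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        assign = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign

def knn_predict(train_vecs, train_labels, vec, k=3):
    """Supervised step: majority label among the k nearest training vectors."""
    ranked = sorted(zip(train_vecs, train_labels), key=lambda t: sq_dist(t[0], vec))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

# 1. Cluster the unlabelled documents.
vectors = [vectorize(doc) for doc in docs]
assign = kmeans(vectors, k=2)

# 2. Name each cluster. In practice a person inspects the clusters; here the
#    names are tied to the clusters containing documents 0 and 3.
cluster_names = {assign[0]: "sports", assign[3]: "finance"}
labels = [cluster_names[a] for a in assign]

# 3. Use the cluster-derived labels as training data for a classifier.
pred_sports = knn_predict(vectors, labels, vectorize("football team scored"))
pred_finance = knn_predict(vectors, labels, vectorize("investors buy stock"))
print(assign, pred_sports, pred_finance)
```

The supervised step is only as good as the cluster labels, so it is worth checking a sample of each cluster by hand before training on it.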