Short project to de-duplicate records using fuzzy matching and machine learning, built with Python and Elasticsearch.
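
To give a rough idea of what fuzzy-matching de-duplication means, here is a minimal sketch. It is not this project's implementation: it uses only the standard library's `difflib` in place of Elasticsearch and a trained model, and the names `similarity`, `deduplicate`, the sample records, and the `0.85` threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two strings.

    Normalization (lowercase, strip whitespace) is an assumed preprocessing step.
    """
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def deduplicate(records: list[str], threshold: float = 0.85) -> list[str]:
    """Keep the first record from each group of near-duplicates.

    A record is dropped when it is at least `threshold` similar to a record
    that has already been kept. The threshold value is a hypothetical default.
    """
    kept: list[str] = []
    for record in records:
        if all(similarity(record, seen) < threshold for seen in kept):
            kept.append(record)
    return kept


# Hypothetical sample data: a case variant, a typo, and an unrelated record.
records = ["Acme Corporation", "ACME Corporation ", "Acme Corporatoin", "Globex Inc"]
unique = deduplicate(records)
print(unique)
```

This pairwise comparison is quadratic in the number of records, so it only suits small inputs. In a setup like the one described above, a search index would typically narrow each record down to a handful of candidate matches first, and a learned model could replace the fixed threshold when deciding whether two candidates are the same record.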