A basic k-means implementation in PySpark using RDDs. The input points are latitude and longitude coordinates derived from the IP addresses of requests to a web server.
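To illustrate the general shape of such an implementation, here is a minimal sketch of one way to run k-means over an RDD of `(lat, lon)` tuples. It is not necessarily how this repository does it: the function names, the parameters `converge_dist`, `max_iter` and `seed`, and the use of squared Euclidean distance on raw coordinates (rather than great-circle distance) are all assumptions made for the example.

```python
# Illustrative sketch only: all names and defaults here are hypothetical.

def squared_distance(p, q):
    """Squared Euclidean distance between two (lat, lon) pairs (an assumed metric)."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def closest_center(point, centers):
    """Return the index of the center nearest to `point`."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))


def add_points(a, b):
    """Combine two (lat_sum, lon_sum, count) accumulators."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def kmeans(points, k, converge_dist=0.1, max_iter=20, seed=42):
    """Run k-means on an RDD of (lat, lon) tuples and return the final centers."""
    points = points.cache()  # the RDD is scanned once per iteration
    centers = points.takeSample(False, k, seed)  # random initial centers
    for _ in range(max_iter):
        # Assign every point to its nearest center, then sum the
        # coordinates and counts per cluster on the executors.
        sums = (points
                .map(lambda p: (closest_center(p, centers), (p[0], p[1], 1)))
                .reduceByKey(add_points)
                .collectAsMap())
        # A center with no assigned points keeps its previous position.
        new_centers = list(centers)
        for i, (lat_sum, lon_sum, n) in sums.items():
            new_centers[i] = (lat_sum / n, lon_sum / n)
        # Stop once the centers have (almost) stopped moving.
        shift = sum(squared_distance(old, new)
                    for old, new in zip(centers, new_centers))
        centers = new_centers
        if shift < converge_dist:
            break
    return centers
```

With a running `SparkContext` named `sc`, this could be called as `kmeans(sc.parallelize(coords), k=5)`, where `coords` is a list of `(lat, lon)` tuples. Only the small list of centers is shipped to the workers on each iteration, while the per-cluster sums are computed in parallel with `reduceByKey`.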