Web scraping library and command-line tool for text discovery and extraction (main content, metadata, comments)