From the course: Web Scraping with Python
Solution: Scraping news sites - Python Tutorial
[Instructor] Like I said before, I decided to scrape news articles from Associated Press, CNN, and Yahoo News. To be honest, I got a little lucky with these sites. I did scope out a few different sources and picked ones that seemed moderately easy to scrape, but sometimes you really don't know what you're going to get until you actually build it. So everything went pretty smoothly, all things considered.

I created a NewsArticle item, and that contains the title, description, date, author, full text, all that stuff.

CNN was probably the most straightforward site to scrape. The only tricky thing I had to do was something that we already covered in chapter one, and that's use the metadata to get information. I was able to get really clean versions of the description and date published from the metadata in the header. The author's name was also there, but still required a little cleaning, which I did in the…
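As a rough illustration of the metadata approach described here, the sketch below defines a NewsArticle container and fills it from the meta tags in a page header. It is not the instructor's code: it uses only the Python standard library rather than whatever framework the course uses, the sample HTML is made up, the meta keys (`og:title`, `description`, `article:published_time`, `author`) are assumptions about what a news page might expose, and the `clean_author` helper is a hypothetical stand-in for the author cleanup the transcript mentions but does not show.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class NewsArticle:
    """Hypothetical stand-in for the NewsArticle item named in the transcript."""
    title: str = ""
    description: str = ""
    date: str = ""
    author: str = ""
    full_text: str = ""


class MetaTagParser(HTMLParser):
    """Collects <meta> tags into a dict keyed by their name or property attribute."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        key = attrs.get("name") or attrs.get("property")
        content = attrs.get("content")
        if key and content is not None:
            self.meta[key] = content.strip()


def clean_author(raw):
    """Assumed cleanup: collapse whitespace and drop a leading 'By '."""
    name = " ".join(raw.split())
    if name.lower().startswith("by "):
        name = name[3:]
    return name


def article_from_html(html):
    """Build a NewsArticle from header metadata only (full_text is left empty)."""
    parser = MetaTagParser()
    parser.feed(html)
    meta = parser.meta
    return NewsArticle(
        title=meta.get("og:title", ""),
        description=meta.get("description", ""),
        date=meta.get("article:published_time", ""),
        author=clean_author(meta.get("author", "")),
    )


# Made-up page header; real sites may use different meta keys.
SAMPLE_HTML = """
<html><head>
<meta property="og:title" content="Example headline">
<meta name="description" content="A short summary of the story.">
<meta property="article:published_time" content="2020-01-15T12:30:00Z">
<meta name="author" content="By  Jane Doe ">
</head><body><p>Story text.</p></body></html>
"""

article = article_from_html(SAMPLE_HTML)
print(article)
```

Reading from the header metadata tends to give cleaner values than picking text out of the visible page, which is why the description and date need no further processing in this sketch while the author field still passes through a cleanup step.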