Join Barton Poulson for an in-depth discussion in this video, Web formats, part of Data Science Foundations: Fundamentals.
- [Voiceover] When you're getting data for your data science project, you may have it in a spreadsheet, you may have it in a relational database, but it may also be on the web. For that reason, you need to learn a little bit about what I call web formats. The idea here is that data science thrives on the Internet, both for retrieving data and for sharing data. There are a few formats to know here. The first is HTML, or HyperText Markup Language. The second is XML, or Extensible Markup Language.
You need to be able to navigate HTML in order to scrape data, that is, pull data from a web page. That's probably the most important purpose of learning HTML for a data scientist. Now, in terms of the actual data structure, one very common format is what's called XML, or Extensible Markup Language. This is a data encoding that is simultaneously human-readable and machine-readable, which is not always the case. And it's a semistructured format: tags label each piece of data, but the records don't have to fit a fixed table layout.
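To make the two ideas above concrete, here is a minimal Python sketch that uses only the standard library. It is an illustration, not the course's own code: the sample page, the sample XML document, and the names `TableScraper`, `scrape_table`, and `parse_scores_xml` are all made up for this example. The first part walks the HTML tags to pull the cells out of a table, and the second part reads the same data from XML.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical page fragment; a real scrape would download this from a URL.
HTML_PAGE = """
<html><body>
  <table id="scores">
    <tr><th>name</th><th>score</th></tr>
    <tr><td>Ana</td><td>91</td></tr>
    <tr><td>Ben</td><td>78</td></tr>
  </table>
</body></html>
"""

# Hypothetical XML document holding the same data.
XML_DOC = """
<scores>
  <student name="Ana"><score>91</score></student>
  <student name="Ben"><score>78</score></student>
</scores>
"""


class TableScraper(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])          # start a new row
        elif tag in ("td", "th") and self.rows:
            self._in_cell = True
            self.rows[-1].append("")      # start a new, empty cell

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        # Only keep text that sits inside a table cell.
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


def scrape_table(html):
    """Return the table cells in `html` as a list of rows."""
    parser = TableScraper()
    parser.feed(html)
    return parser.rows


def parse_scores_xml(text):
    """Return a {name: score} dict from the sample XML layout."""
    root = ET.fromstring(text)
    return {
        student.get("name"): int(student.findtext("score"))
        for student in root.findall("student")
    }


if __name__ == "__main__":
    print(scrape_table(HTML_PAGE))
    print(parse_scores_xml(XML_DOC))
```

Notice that the HTML version has to infer structure from layout tags such as `tr` and `td`, while the XML version reads tags that describe the data itself. In practice, many people use third-party scraping libraries rather than the bare parser shown here.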
- The demand for data science
- Roles and careers
- Ethical issues in data science
- Sourcing data
- Exploring data through graphs and statistics
- Programming with R, Python, and SQL
- Data science in math and statistics
- Data science and machine learning
- Communicating with data