In this video, learn how to read text files using Python.
- [Instructor] In this chapter, we will read data into a corpus and explore it. The code samples are available in the notebook zero two XX reading data dot ipynb. For exercises in this course, we use a file called Spark Course Description dot txt. This is available as part of your course material. Let us explore its content. It contains a description of a course on Apache Spark.

All text data needed for processing has to be acquired from a data source. In this code example, we will read a text file into a Python variable. This is standard Python and does not use the NLTK library. We read the Spark Course Description dot txt into a variable called file data, and then we print the first 200 characters of the file. Let us run the code now.

In general, data can be acquired from various sources, including files, databases, or strings. There are several Python packages and tools that help to get data from these sources. We do not intend to focus on those areas in this course.
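The reading step described in the transcript can be sketched roughly as follows. This is not the notebook's actual code, only a minimal illustration in standard Python. The on-disk file name `Spark-Course-Description.txt`, the UTF-8 encoding, and the helper names `read_text_file` and `preview` are assumptions made for this example; only the variable name `file_data` and the 200-character preview come from the narration.

```python
from pathlib import Path


def read_text_file(path, encoding="utf-8"):
    """Return the whole content of a text file as a single string.

    Hypothetical helper; the encoding is an assumption about the file.
    """
    with open(path, "r", encoding=encoding) as handle:
        return handle.read()


def preview(text, n_chars=200):
    """Return the first n_chars characters of text (hypothetical helper)."""
    return text[:n_chars]


# Assumed file name; adjust it to match the file in your course material.
course_file = Path("Spark-Course-Description.txt")

if course_file.exists():
    file_data = read_text_file(course_file)
    print("Data read from file:", preview(file_data))
else:
    print("File not found:", course_file)
```

Opening the file in a `with` block closes it automatically once the content has been read, and slicing the resulting string is enough to inspect the start of the file without printing all of it.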
- Text mining today
- Reading text files using Python
- Cleansing text data
- Building n-gram databases for text predictions
- Preparing TF-IDF matrices for machine learning
- Scaling text processing for performance
Skill Level Intermediate
Processing Text with R Essential Training, with Kumaran Ponnambalam (55m 57s, Intermediate)
1. Text Mining
2. Reading Text
3. Text Cleansing and Extraction
4. Advanced Text Processing
5. Best Practices