From the course: Data Ingestion with Python
What should be in schema - Python Tutorial
What should be in schema
- I hope you're now convinced, without a doubt, that you need a schema. But what should go in it? I say: everything you need to make sense of the data. Here are some parts to consider. Description: some text about what this data is. In our example, PGTM should be spelled out as peak gust time. Types: what's the type of the data? Is it an integer, a float, text? Units: what are the units of the data? In our case, temperature is in tenths of a degree Celsius. Constraints: the lowest temperature ever recorded was minus 89.2 degrees Celsius, and the highest recorded was 57.8 degrees Celsius. However, if you measure engine temperature, those limits will differ. Constraints between fields: you can't have snow when the temperature is above a certain point. Relations: what contains what? Is it a one-to-one or a one-to-many relation? Anything that can help you make sense of the data should be there. Don't get to a state where you don't know what a piece of data means or how to check its quality.
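To make those parts concrete, here is a minimal sketch of a schema written as plain Python. It is not the course's exercise code. The field names `TMAX` and `SNOW`, the `Field` class, the `validate` function, and the 5.0 degree snow threshold are all assumptions made up for illustration; only the temperature limits come from the figures quoted above, converted to tenths of a degree Celsius.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Field:
    """One schema entry: description, type, units, and value constraints."""
    description: str
    dtype: type
    units: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None


# Hypothetical schema; field names are assumed for this example.
SCHEMA = {
    "TMAX": Field(
        description="Maximum temperature",
        dtype=int,
        units="tenths of a degree Celsius",
        minimum=-892,  # -89.2 degrees Celsius, the lowest value quoted above
        maximum=578,   # 57.8 degrees Celsius, the highest value quoted above
    ),
    "SNOW": Field(
        description="Snowfall",
        dtype=int,
        units="millimeters",
        minimum=0,
    ),
}

# Assumed threshold for the cross-field rule: no snow above 5.0 degrees Celsius.
SNOW_MAX_TEMP = 50


def validate(record):
    """Return a list of human-readable problems found in one record."""
    errors = []
    for name, field in SCHEMA.items():
        if name not in record:
            errors.append(f"{name}: missing")
            continue
        value = record[name]
        if not isinstance(value, field.dtype):
            errors.append(f"{name}: expected {field.dtype.__name__}")
            continue
        if field.minimum is not None and value < field.minimum:
            errors.append(f"{name}: {value} is below {field.minimum} ({field.units})")
        if field.maximum is not None and value > field.maximum:
            errors.append(f"{name}: {value} is above {field.maximum} ({field.units})")

    # Constraint between fields, checked only when each field is valid on its own.
    if not errors and record["SNOW"] > 0 and record["TMAX"] > SNOW_MAX_TEMP:
        errors.append("SNOW: snowfall reported while TMAX is above the snow threshold")
    return errors


if __name__ == "__main__":
    print(validate({"TMAX": 250, "SNOW": 0}))   # a plausible record
    print(validate({"TMAX": 900, "SNOW": 0}))   # temperature out of range
    print(validate({"TMAX": 250, "SNOW": 12}))  # snow on a 25.0 degree day
```

Keeping the description and units next to the type and limits means the same object documents the data and checks it. Relations between records (one-to-one, one-to-many) are not covered by this sketch.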