From the course: SQL Server Machine Learning Services: Python

What is machine learning services?

- [Instructor] Let's kick things off with a discussion of exactly what Machine Learning Services means in the context of SQL Server and how that relates to the Python programming language. First, let's understand what Python is. Python is an open source programming language that has widespread adoption in the world of data science. This is because it includes a large standard library of functionality, but is also highly extensible. Additional libraries and packages can easily be added to gain specific functionality, such as generating charts and graphs, performing image analysis, or conducting sentiment analysis and natural language processing on text.

In the past, in order to use Python to analyze data stored in a SQL Server database, data scientists would need to export that data first. That changed in 2017, when Microsoft added support for in-database processing with Python to all editions of SQL Server. Python joins the R and Java programming languages in a SQL Server feature that Microsoft collectively calls Machine Learning Services.

Now, while the name Machine Learning Services implies support for developing advanced machine learning models with your data, and you can certainly do that with Python, it's more simply the ability to run any script on your relational data, from basic arithmetic on up. And with the large number of add-on packages that Python supports, the doors are wide open to what you can do with this capability. So don't let the name Machine Learning Services scare you off by suggesting that this is some high-level, complex, niche functionality. Support for Python is flexible and can be used by anyone, from an entry-level analyst up to a seasoned pro with many years of Python experience under their belt.

Python support in Machine Learning Services provides a number of important new capabilities when it comes to processing large amounts of data.
I mentioned that processing now occurs in-database, and here are three reasons why that's significant. First, because data no longer needs to be extracted from the database before it can be processed through a script, it keeps the benefits of SQL Server's security wrapper. The data stays in the database, where access can be controlled, logged, and audited. This is particularly advantageous when working with sensitive financial or personal data, where compliance with handling protocols needs to be strictly maintained.

Second, copying large amounts of data out of a database can put significant strain on network resources. Because of this, in the past, many data analysts would only process a smaller, representative sample of the data rather than working with the full data set. Since the data no longer needs to leave SQL Server first, those restrictions are removed, which opens the door to fuller and more complete analyses.

And finally, because the data stays in place, Python scripts can take full advantage of the performance benefits brought by SQL Server technologies, such as in-memory tables and columnstore indexes.

In order to run Python code in SQL Server, you'll use the same query editors that you're probably already familiar with. You don't need any additional software or application development tools to get started. For most people, that means opening up a new query window in SQL Server Management Studio or Azure Data Studio and simply typing the code that you want to execute. Further, you can save your Python scripts right inside of the database as stored procedures, so that other database users can easily execute your scripts without having to write the code themselves. And since Python support is available in every edition of SQL Server starting in 2017, even the free-to-use Express edition, you can take advantage of Python integration no matter the scale of your application.
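
To make the query-window and stored-procedure ideas a little more concrete, here is a minimal sketch of what such a script might look like. It assumes Machine Learning Services is installed and external scripts are enabled on the instance. The transcript does not name the entry point; the sketch uses `sp_execute_external_script`, the system stored procedure SQL Server provides for running external scripts, where the input query's rows arrive in the Python script as a pandas DataFrame named `InputDataSet` by default and whatever is assigned to `OutputDataSet` is returned. The table `dbo.OrderLines`, its columns, the procedure name `dbo.usp_OrderLineTotals`, and the helper functions are all hypothetical, chosen only for illustration. The Python below just assembles the T-SQL text you would paste into a query window; it does not connect to a server.

```python
def tsql_literal(text: str) -> str:
    """Quote text as a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + text.replace("'", "''") + "'"


def build_exec_batch(python_script: str, input_query: str) -> str:
    """Return a T-SQL statement that runs python_script over input_query's rows."""
    return (
        "EXEC sp_execute_external_script\n"
        "    @language = N'Python',\n"
        f"    @script = {tsql_literal(python_script)},\n"
        f"    @input_data_1 = {tsql_literal(input_query)};"
    )


def build_procedure(proc_name: str, python_script: str, input_query: str) -> str:
    """Wrap the same call in CREATE PROCEDURE so others can run it by name.

    proc_name is inserted as-is, so it is assumed to be a trusted identifier.
    The body is deliberately not re-indented: extra leading spaces would end
    up inside the embedded Python script and change its indentation.
    """
    body = build_exec_batch(python_script, input_query)
    return f"CREATE PROCEDURE {proc_name}\nAS\nBEGIN\n{body}\nEND;"


# Basic arithmetic on relational data: multiply two columns of each row.
# (Hypothetical column names; this text runs inside SQL Server, not here.)
PYTHON_SCRIPT = "\n".join([
    'InputDataSet["LineTotal"] = InputDataSet["Quantity"] * InputDataSet["UnitPrice"]',
    "OutputDataSet = InputDataSet",
])

# Hypothetical table, used only for illustration.
INPUT_QUERY = "SELECT Quantity, UnitPrice FROM dbo.OrderLines"

# Ad hoc version: paste the printed text into a new query window.
print(build_exec_batch(PYTHON_SCRIPT, INPUT_QUERY))
print()

# Saved version: other users could then run EXEC dbo.usp_OrderLineTotals;
print(build_procedure("dbo.usp_OrderLineTotals", PYTHON_SCRIPT, INPUT_QUERY))
```

Once a procedure like this exists, a colleague would only need to run `EXEC dbo.usp_OrderLineTotals;` (again, a made-up name) without ever seeing the Python inside it, which is the sharing benefit described above.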
