Microsoft Professional Program for Data Science

As a first step in the data science learning track, Microsoft Virtual Academy offers the Microsoft Professional Program for Data Science, delivered through the online learning platform edX.


How does it work?

This program, certified by Microsoft, is made up of 3 units with 10 courses (lasting about 8-12 hours each) and a capstone project. The Microsoft Program Certificate is awarded once all the courses have been verified and certified through the edX platform. If you are not interested in the certification, the courses can also be taken free of charge, like the rest of the MOOCs on edX.

Each course runs for three months and starts at the beginning of a quarter: January to March, April to June, July to September, and October to December. The capstone runs for four weeks at the beginning of each quarter: January, April, July, and October.

It is a dynamic program whose contents can change from one session to the next. In fact, since I got my certificate in February 2018, two new courses have been added, one has been removed from the track, and most of the rest have been updated.

What will you learn?

The different courses of the program are focused on acquiring the basic skills and knowledge of a data scientist with different programming languages and Microsoft tools. The main goals for each of the courses are the following:

Unit 1 - Fundamentals

  1. Introduction to Data Science (formerly MS Data Science Orientation) -> Basic statistics concepts and use of MS Excel for data analysis.

  2. Analyzing and Visualizing Data with Power BI / Analyzing and Visualizing Data with Excel -> Creation of data models, analysis and visualization using Microsoft Excel or Power BI tools.

  3. Analytics Storytelling for Impact (new course) -> Application of storytelling principles to improve reports and analytics presentations.

  4. Ethics and Law in Data and Analytics (new course) -> Foundational skills in applying ethical and legal frameworks to data analysis and artificial intelligence applications.

Unit 2 - Core Data Science

  1. Querying Data with Transact-SQL -> Basics of querying and modifying data in SQL Server and Azure SQL with Transact-SQL. The course also gives a first contact with MS Azure through the creation of a SQL database.

  2. Introduction to R for Data Science / Introduction to Python for Data Science -> First contact with the 2 main programming languages for data analysis. The course is notable for being implemented in collaboration with the online learning platform DataCamp.

  3. Essential Statistics for Data Analysis using Excel / (new) Essential Math for Machine Learning: R Edition / (new) Essential Math for Machine Learning: Python Edition -> Solid understanding of descriptive statistics, basic probability, random variables, sampling, confidence intervals and hypothesis testing. Practical examples are implemented in MS Excel (or in R and Python in the new course alternatives).

Unit 3 - Applied Data Science

  1. Data Science Research Methods: R Edition / Data Science Research Methods: Python Edition (formerly Data Science Essentials) -> Hands-on experience with the data science concepts learned so far (statistics and data analysis), implemented in R or Python with practical examples and brief notes on machine learning.

  2. Principles of Machine Learning: R Edition / Principles of Machine Learning: Python Edition (formerly Principles of Machine Learning) -> Building on the concepts of the previous course, a deeper look at machine learning models (classification, regression and clustering), model performance improvement, and more advanced models such as neural networks and Support Vector Machines. This course is key for the challenge in the capstone project.

  3. Developing Big Data Solutions with Azure Machine Learning / Analyzing Big Data with Microsoft R / Implementing Predictive Analytics with Spark in Azure HDInsight -> The goals of this final course depend on the option selected. In my case, I chose the last one, which teaches how to use Spark with Python in MS Azure HDInsight to create predictive analytics and machine learning solutions.

Course no longer in the Data Science track:

  • Programming with R for Data Science / Programming with Python for Data Science -> This course was dropped from the Data Science track to make room for the new courses, given that its content is now covered in the language-specific courses (mainly course 7, and also 8 and 9). Even so, it is highly instructive: it summarizes the whole data analysis methodology while teaching R or Python with their main libraries and packages, from data collection, cleaning and transformation to the creation and evaluation of machine learning models and data visualization.

Unit 4 - Project Capstone

The capstone project consists of analyzing a real dataset (in my case, poverty data in the United States with 33 different variables) and building a predictive model from it. The project evaluation is based on 3 objectives:

  • Data exploration and analysis to answer several questions about the dataset
  • Kaggle-style challenge to build a predictive model, whose score depends on its accuracy (while competing with the other students)
  • Development of a detailed final report on the data analysis process and the construction of the model, which is evaluated by other students.
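
To give a feel for the modelling part of the challenge, here is a minimal sketch in Python of the general idea: fit a model on one part of the data and score it on rows it has not seen. Everything in it is an assumption for illustration only. The data is synthetic (it is not the capstone dataset), the helper names are hypothetical, the model is a one-feature least-squares line, and the score is R^2 on a held-out split, which is not necessarily the metric the capstone uses.

```python
import random
from statistics import mean


def make_synthetic_rows(n, seed=0):
    """Hypothetical data: one feature x and a noisy target y = 2x + 1 + noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        y = 2.0 * x + 1.0 + rng.gauss(0, 1)
        rows.append((x, y))
    return rows


def fit_line(rows):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in rows)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(rows, slope, intercept):
    """Share of the target's variance explained by the fitted line on these rows."""
    ys = [y for _, y in rows]
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in rows)
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Fit on the first 150 rows, then score on the 50 rows held back.
rows = make_synthetic_rows(200)
train, holdout = rows[:150], rows[150:]
slope, intercept = fit_line(train)
score = r_squared(holdout, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} holdout R^2={score:.3f}")
```

Scoring on held-out rows rather than on the training rows is what makes a leaderboard-style comparison meaningful, since it measures how well the model generalizes instead of how well it memorizes.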

Is it worth it?

In my opinion, absolutely, especially for those who are thinking of starting in the world of Data Science from scratch.

The approach followed in this program allows you to learn the key concepts of data analysis first, then move on to more advanced knowledge until you take your first steps with machine learning models. In addition, it is valuable to work with tools such as Excel, Power BI and Azure and to learn languages such as Transact-SQL, R or Python, all of which are essential in this field.

Finally, from a professional and CV point of view, the institution behind the course, its organization and the prestige of the tutors make this program a good asset for demonstrating data science skills when pursuing new job opportunities.

In summary, my advice to anyone wavering about whether to join this program is “Go ahead” and enroll now in the first of the courses (initially in audit mode, which is free, since it can be certified after completion). As you will see, this course is quite engaging and will serve as a starting point to embark on the whole program.
