The Data Science Foundation capsule has been prepared and curated by Project MyNT to introduce the exciting and rapidly changing field of data science. The capsule will cover:
Wrangle – You will learn to transform your data into a form suitable for analysis.
Program – The capsule will teach you how to leverage powerful programming paradigms to solve data problems with ease and clarity.
Explore – You will learn to quickly examine data, build hypotheses and test them.
Communicate – You will learn the tools for integrating prose, code and results.
By the end of the capsule, you will be able to build a data-driven web application to convey your analytical findings.
The capsule is curated for participants who wish to learn the building blocks of data science in a fast, fluent and fun manner.
We will start with a brief look at the data science landscape, covering the different disciplines, tools and technologies that come under the ambit of data science.
Since this is a hands-on, interactive capsule, we shall also set up a working environment on your laptops. This will involve downloading and installing the R statistical programming language, the RStudio integrated development environment, Git for version control, and the libraries and packages that will be used during the capsule.
“There is no such thing as information overload. There is only bad design.” – Edward Tufte
Today we will learn the power of visualization. R has several systems for making graphs and plots, but ggplot2 is perhaps the most elegant and versatile. Based on the layered grammar of graphics, this package lets us build a plot layer by layer and visualize data in almost any way conceivable.
We will take a dataset and ask questions about it. We shall then learn how to use the ggplot2 library to answer these questions graphically.
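As a rough illustration of the layered grammar described above, the sketch below asks one question of a dataset and answers it with a plot. The capsule's dataset is not named in the text, so the `mpg` dataset bundled with ggplot2 is used here purely as an assumption, as is the question itself.

```r
library(ggplot2)

# Assumption: mpg (shipped with ggplot2) stands in for the capsule's dataset.
# Question: do cars with bigger engines get fewer highway miles per gallon?
p <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  # Layer 1: one point per car, coloured by vehicle class.
  geom_point(aes(colour = class)) +
  # Layer 2: a smoothed trend line drawn over the points.
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE) +
  # Labels make the question being asked explicit.
  labs(
    x = "Engine displacement (litres)",
    y = "Highway miles per gallon",
    colour = "Vehicle class"
  )

print(p)
```

Each `+` adds another layer or component to the same plot, which is what makes the grammar "layered".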
While ggplot2 is a grammar of graphics, dplyr is a grammar of data manipulation. In this class we will learn the powerful ‘verbs’ of the dplyr package. Data rarely arrives clean and fit for analysis, and that’s where dplyr helps. We shall work with the flights dataset and learn to create new variables and apply many other transformations that make our data more pliable for analysis.
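A minimal sketch of how the dplyr verbs chain together is shown below. It assumes the flights dataset is the one from the `nycflights13` package, which the text does not confirm, and the derived variables `gain` and `speed_mph` are illustrative choices rather than the capsule's own exercise.

```r
library(dplyr)
library(nycflights13)  # Assumption: source of the flights dataset.

carrier_summary <- flights %>%
  # filter(): keep only flights with recorded delays.
  filter(!is.na(dep_delay), !is.na(arr_delay)) %>%
  # mutate(): create new variables from existing columns.
  mutate(
    gain = dep_delay - arr_delay,          # minutes made up in the air
    speed_mph = distance / air_time * 60   # average speed in miles per hour
  ) %>%
  # group_by() + summarise(): collapse to one row per carrier.
  group_by(carrier) %>%
  summarise(
    mean_gain = mean(gain),
    mean_speed = mean(speed_mph, na.rm = TRUE),
    n_flights = n()
  ) %>%
  # arrange(): sort carriers by the time they make up on average.
  arrange(desc(mean_gain))

print(carrier_summary)
```

Each verb takes a data frame and returns a data frame, so the steps read top to bottom as a pipeline.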
Armed with the knowledge of visualization and data transformation, we will spend Day 4 on Exploratory Data Analysis (EDA). This is where you will don the Sherlock hat, ask questions and test hypotheses about your data.
In the last two days, you will learn to present your data and analysis to a wider audience. We will build a full-fledged web application that extracts data using APIs.
We will use the Shiny package in R to build the web application, which will also show a network analysis of various characters and Marvel comics.
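To give a feel for the shape of a Shiny application, here is a minimal sketch. It is not the capsule's Marvel application: the API and network analysis are not specified in the text, so this example assumes R's built-in `faithful` dataset and a simple histogram instead.

```r
library(shiny)

# UI: what the user sees, one input control and one output slot.
ui <- fluidPage(
  titlePanel("Old Faithful eruptions"),
  sliderInput("bins", "Number of bins:", min = 5, max = 50, value = 20),
  plotOutput("hist")
)

# Server: how outputs are computed from inputs.
server <- function(input, output) {
  output$hist <- renderPlot({
    # Re-runs automatically whenever the slider value changes.
    hist(
      faithful$eruptions,
      breaks = input$bins,
      main = "Eruption duration",
      xlab = "Minutes"
    )
  })
}

# Combine the two halves into an app object.
app <- shinyApp(ui = ui, server = server)

# Uncomment to launch the app in a browser:
# runApp(app)
```

The split into a `ui` object and a `server` function is the core pattern; a larger application adds more inputs and outputs to the same two pieces.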
Yes, you need to carry your own Wi-Fi enabled laptop. It can run Windows, macOS or Linux.
No. The curriculum is designed to get you doing data science as quickly as possible and assumes no previous programming experience.
| 1 | Academic Fees | Rs. 5500/- |
| 2 | Other Expenses (Joining Kit, Portal and Certification) | Rs. 2000/- |
|   | Total Payable | Rs. 7500/- |