Thursday 3 March 2016

Data Science with Python, Anaconda, pandas and Bokeh - Curriculum

Data Science with Python

Course Description


Python is a general-purpose programming language that is increasingly popular for data science. Companies worldwide use Python to harvest insights from their data and gain a competitive edge. Unlike general Python tutorials, this course focuses on Python specifically for data science. You will learn powerful ways to store and manipulate data, along with the data science tools you need to start your own analyses.

What are the pre-requisites?


There are no pre-requisites. No prior knowledge of Statistics, Python programming or analytic techniques is required.

Duration


22 to 25 Hours

Introduction to Data Science


·      Introduction to Data, Tables, Database, ETL, EDW
·      What is Data Science?
·      Popular Tools
·      Role of Data Scientist
·      Analytics Methodology

Python - Getting Started

·      Installing Python on Windows
·      Installing Python on Mac and Linux
·      Introduction to Editors
·      Installing the PyCharm and Sublime Text Editors

Python Basics

·      Numbers and Math in Python
·      Variables and Input
·      Built-in Modules and Functions
·      Save and Run Python Files
·      Strings
·      Python List
·      Python Slices and Slicing
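
As a taste of the list and slicing topics above, here is a minimal sketch. The variable names and sample values are made up for illustration and are not taken from the course material.

```python
# Illustrative data only; the values are arbitrary.
scores = [10, 20, 30, 40, 50, 60]

first_three = scores[:3]        # items at index 0, 1 and 2
last_two = scores[-2:]          # negative indices count from the end
every_other = scores[::2]       # a step of 2 skips every second item
reversed_scores = scores[::-1]  # a step of -1 walks the list backwards

# Strings support the same slice syntax as lists.
word = "Python"
middle = word[1:4]              # characters at index 1, 2 and 3

print(first_three, last_two, every_other, reversed_scores, middle)
```

A slice always returns a new list or string, so the original `scores` and `word` are left unchanged.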

Python Statements

·      If-Else Statements
·      Elif (Else If) and Nested If Statements
·      Loops
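
The statements listed above can be sketched together in a few lines. The `grade` function and its thresholds are hypothetical, chosen only to show the syntax.

```python
def grade(score):
    """Map a numeric score to a letter (thresholds are illustrative)."""
    if score >= 90:
        return "A"
    elif score >= 75:   # Python spells "else if" as elif
        return "B"
    else:
        return "C"

# A for loop visits each item of the list in order.
grades = []
for s in [95, 80, 60]:
    grades.append(grade(s))

print(grades)
```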

Python Object-Oriented Programming

·      Defining Functions
·      Default Parameters and Multiple Arguments
·      Class and Self
·      Class Constructors and Destructors
·      Subclass, Superclass and Inheritance
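
The following sketch touches on several of the topics above: a constructor, `self`, a default parameter, and a subclass that overrides a method. The `Shape` and `Rectangle` classes are hypothetical examples, not part of the course material.

```python
class Shape:
    """Superclass: holds a name and a default area."""

    def __init__(self, name):
        # __init__ is the constructor; self refers to the new instance.
        self.name = name

    def area(self):
        return 0.0

    def describe(self):
        return f"{self.name} with area {self.area():.1f}"


class Rectangle(Shape):
    """Subclass: inherits describe() and overrides area()."""

    def __init__(self, width, height=1.0):  # height has a default value
        super().__init__("rectangle")        # run the superclass constructor
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


rect = Rectangle(3, 2)
summary = rect.describe()
print(summary)
```

Because `describe` calls `self.area()`, the subclass version of `area` is used even though `describe` is defined only on the superclass.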

Introduction to Data Visualization

·      Introduction to Data Science and Visualization Tools in Python
·      Installing and Setting Up IPython Notebook
·      Installing Anaconda and pandas
·      Setting Up Environment

Learning NumPy

·      Creating Arrays
·      Using Arrays and Scalars
·      Indexing Arrays
·      Array Transposition
·      Universal Array Functions
·      Array Processing
·      Array Input and Output

Working with pandas

·      Series
·      Data Frames
·      Index Objects
·      Reindex
·      Drop Entry
·      Selecting Entries
·      Data Alignment
·      Rank and Sort
·      Summary Statistics
·      Missing Data
·      Index Hierarchy

Working with Data Part 1

·      Reading and Writing Text Files
·      JSON with Python
·      HTML with Python
·      Microsoft Excel Files with Python

Working with Data Part 2

·      Merge, Merge on Index and Concatenate
·      Combining Data Frames
·      Reshaping and Pivoting
·      Duplicates in Data Frames
·      Mapping, Replacing, Rename Index and Binning
·      Outliers and Permutations

Working with Data Part 3

·      Group by on Data Frames
·      Group by on Dicts and Series
·      Aggregation
·      Splitting, Applying and Combining
·      Cross Tabulation

Working with Visualization


·      Installing Seaborn
·      Histograms
·      Kernel Density Estimate Plots
·      Combining Plot Styles
·      Box and Violin Plots
·      Regression Plots
·      Heat Maps and Clustered Matrices
·      Example Projects -15

Machine Learning


·      Introduction
·      Linear Regression
·      Logistic Regression
·      Multi Class Classification – Logistic Regression
·      Multi Class Classification – Nearest Neighbor
·      Support Vector Machines
·      Naïve Bayes Theory


