OSI 102 - Python for Basic Collection C01

Academic Credit Value:
None
Course Delivery Mode:
Virtual Classroom
Hours of Study:
30 hours
Course Prerequisite(s):
Recommended: Intermediate skills with MS Excel; OSI 101 Tradecraft and Operations
Course Anti-requisite(s):
n/a
Instructor Name:
Sami Khoury
Course Dates:
05/10/2020 - 07/05/2020



Required Course Materials:
Participants are to bring a laptop to each class. Resources/materials will be provided during class.
Optional Course Materials:
Course Description:

Participants are advised to retain course outlines for future use in support of applications for employment.
Refer to the Policy & Procedure section for further course and Continuing Education (CE) information.

In this course, participants will learn how to treat the Internet as a data repository and how to create automated data collection mechanisms. The course starts with teaching the absolute essentials of the Python programming language, and then walks participants through the practical steps of constructing their very first functional web scraper for a basic HTML site.

Learning Outcomes:

By the end of this course, participants will be able to:  

Create simple functional programs in the popular and highly versatile Python language.
Examine and understand the source code of a web page and locate the desired content.
Build a basic web scraper from scratch and save the scraped content in a structured format for further analysis.
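
As a rough illustration of the third outcome, the sketch below parses a small HTML snippet, picks out the desired elements, and writes them out as CSV. It is not the course's own material: the sample page, the `item` class name, and the helper names `ItemParser`, `scrape_items`, and `to_csv` are all assumptions made for this example, and it uses only the Python standard library rather than any particular package taught in class.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page. A real scraper would fetch HTML from a site
# whose terms of use permit automated collection.
SAMPLE_HTML = """
<html><body>
  <ul>
    <li class="item">Alpha</li>
    <li class="item">Beta</li>
    <li class="other">Ignored</li>
  </ul>
</body></html>
"""


class ItemParser(HTMLParser):
    """Collect the text of every <li class="item"> element."""

    def __init__(self):
        super().__init__()
        self.in_item = False
        self.items = []

    def handle_starttag(self, tag, attrs):
        # Start a new record when the desired element opens.
        if tag == "li" and dict(attrs).get("class") == "item":
            self.in_item = True
            self.items.append("")

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_item = False

    def handle_data(self, data):
        # Accumulate text only while inside a matching element.
        if self.in_item:
            self.items[-1] += data


def scrape_items(html):
    """Return the stripped text of each matching list item."""
    parser = ItemParser()
    parser.feed(html)
    parser.close()
    return [text.strip() for text in parser.items]


def to_csv(rows):
    """Serialize the scraped items as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["position", "item"])
    for position, item in enumerate(rows, start=1):
        writer.writerow([position, item])
    return buffer.getvalue()


items = scrape_items(SAMPLE_HTML)
print(to_csv(items))
```

Keeping the parsing step and the export step in separate functions means either one can be swapped out later, for example to read from a live page or to save in a different structured format.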

Course Evaluation

Participants are assigned a Pass/Fail notation based on the following course activities:

  • In-class exercises
  • Capstone exercise
Course Format:

This course is designed to present the fundamental concepts and theories of open source intelligence and to promote their application in the workplace and in professional practice. Course activities will include instructor presentations and experiential learning activities.

Assignment Submission:
Course assignments are submitted in class on the specified due date.
Late Coursework:
All coursework must be submitted by the last scheduled date of the class. Requests for extensions must be submitted to the Instructor before the assignment due date.

Policy & Procedures:

Academic Regulations (Attendance, Coursework, Tests/Exams):
In accordance with McMaster University’s General Academic Regulations, “it is imperative that students make every effort to meet the originally scheduled course requirements and it is a student’s responsibility to write examinations as scheduled.” Therefore, all students are expected to attend and complete the specific course requirements (i.e., attendance, assignments, and tests/exams) listed in the course outline on or by the date specified. Students who need to arrange for coursework accommodation as a result of medical, personal or family reasons must contact the course instructor within 48 hours of the originally scheduled due date. It is the student’s responsibility to contact the Program Manager/Program Associate to discuss accommodations and procedures related to deferred tests and/or examinations within 48 hours of the originally scheduled test/exam, as per policy. Failure to contact the course instructor, in the case of missed coursework, or the Program Manager/Program Associate, in the case of a missed test/examination, within the specified 48-hour window will result in a grade of zero (0) on the coursework/exam and no further consideration will be granted.

*Note: Supporting documentation will be required but will not ensure approval of accommodation(s).
Academic Integrity

You are expected to exhibit honesty and use ethical behaviour in all aspects of the learning process. Academic credentials you earn are rooted in principles of honesty and academic integrity. Academic dishonesty is to knowingly act or fail to act in a way that results or could result in unearned academic credit or advantage. This behaviour can result in serious consequences, e.g. a grade of zero on an assignment, loss of credit with a notation on the transcript (notation reads: “Grade of F assigned for academic dishonesty”), and/or suspension or expulsion from the university.

 

It is your responsibility to understand what constitutes academic dishonesty. For information on the various types of academic dishonesty, please refer to the Academic Integrity Policy, located at http://www.mcmaster.ca/academicintegrity/

 

The following illustrates only three forms of academic dishonesty:

  1. Plagiarism, e.g. the submission of work that is not one’s own or for which other credit has been obtained.
  2. Improper collaboration in group work.
  3. Copying or using unauthorized aids in tests and examinations.
Academic Accommodations:

Participants with disabilities who require academic accommodations must contact Student Accessibility Services (SAS) to meet with an appropriate Disability Services Coordinator. To contact SAS, phone 905-525-9140 ext. 28652, or email sas@mcmaster.ca. For further information, consult McMaster University’s Policy for Academic Accommodation for Students with Disabilities.

 

On-line Elements:
In this course, we will be using on-line elements, which may include email, Avenue to Learn, Webex, and external web sites. Participants should be aware that, when they access the electronic components of this course, private information such as first and last names, user names for McMaster e-mail accounts, and program affiliation may become apparent to all other participants in the same course. The available information is dependent on the technology used. Continuation in this course will be deemed consent to this disclosure. If you have any questions or concerns about such disclosure, please discuss them with the instructor.
Turnitin.com:
Course Changes:
The instructor reserves the right to modify elements of the course and will notify participants accordingly.
Course Withdrawal Policy:

Policies related to dropping a course and course withdrawals are posted to Continuing Education’s program webpage, FAQs & Policies (https://www.mcmastercce.ca/cce-policies#Dropping).

Storm Closure Policy:
In the event of inclement weather, the Centre for Continuing Education will abide by the University’s Storm Closure Policy (https://www.mcmaster.ca/policy/Employee/storm_emergency_policy.pdf) and will only close if the University is closed. All in-class courses, exams and room bookings by internal and external clients will be cancelled if the Centre for Continuing Education is closed. On-line courses will take place as scheduled.
Grading Scale:
Course Schedule:
 

Session 1: Overview and Set Up

  • Introduction
  • The data driven approach
  • The Internet as a source
  • Layers of data
  • Python and Anaconda
  • Environment set up
  • Packages and modules

Sessions 2-4: Learning to Code in Python

  • Variables, operators, expressions
  • Built-in data structures
  • Conditional statements and loops
  • Nested elements
  • Error handling

Sessions 5-6: Basic Web Scraping

  • Source code, HTML/XML, XPath
  • Fetching and parsing web data
  • Pandas DataFrames, export output data
  • Encoding
  • Debugging
  • Ethics and legality
  • Build a basic web scraper (exercise set up)

Sessions 7-8: Capstone Exercise

  • Build a basic web scraper
  • Testing, debugging, and refining your code
  • Next steps