Lifelong Learning for a Brighter World


Open Source Intelligence

Protect your privacy and become data-minded

Harness the power of information to gain a competitive advantage

OSI 104 - Data Mining and Analysis Essentials C01

Academic Credit Value:
Course Delivery Mode:
Hours of Study:
30 hours
Course Prerequisite(s):
Suggested:

·         Basic-Intermediate skills with MS Excel

·         Intermediate skills with Python

·         OSI 102 Python for Basic Collection

·         OSI 103 Python for Advanced Collection
Course Anti-requisite(s):
Instructor Name:
Course Dates:
03/30/2020 - 04/03/2020

Required Course Materials:
Participants are to bring a laptop to each class. Resources/materials will be provided during class.
Optional Course Materials:
Course Description:

Participants are advised to retain course outlines for future use in support of applications for employment.
Refer to the Policy & Procedure section for further course and Continuing Education (CE) information.

In this course, participants will learn how to condition and prepare their previously collected data for further analysis, and how to construct analytical and visualization algorithms to extract meaning and intelligence insights. This course also teaches participants how to quantify and analyze linguistic information using natural language processing methods, and how to train and use machine learning algorithms to categorize new information or to generate forecasts based on historical data.

Learning Outcomes:

By the end of this course, participants will be able to:

·         Aggregate and prepare collected data for analysis.

·         Explore, describe and make sense of curated datasets.

·         Glean insights from text using Natural Language Processing.

·         Build models for classification and prediction based on existing data.

Course Evaluation

Participants are assigned a Pass/Fail notation based on the following course activities:

·         In-class exercises

·         Capstone exercise

Course Format:
This course is designed to present the fundamental concepts and theories in open source intelligence and to promote their application in the workplace and in professional practice. Course activities will include instructor presentations and experiential learning activities.
Assignment Submission:
Course assignments are submitted in class on the specified due date.
Late Coursework:

All coursework must be submitted by the last scheduled date of the class. Requests for extensions must be submitted to the Instructor before the assignment due date.

Policy & Procedures:

Academic Regulations (Attendance, Coursework, Tests/Exams):
In accordance with McMaster University’s General Academic Regulations, “it is imperative that students make every effort to meet the originally scheduled course requirements and it is a student’s responsibility to write examinations as scheduled.” Therefore, all students are expected to attend and complete the specific course requirements (i.e. attendance, assignments, and tests/exams) listed in the course outline on or by the date specified. Students who need to arrange for coursework accommodation, as a result of medical, personal or family reasons, must contact the course instructor within 48 hours of the originally scheduled due date. It is the student’s responsibility to contact the Program Manager to discuss accommodations and procedures related to deferred tests and/or examinations within 48 hours of the originally scheduled test/exam, as per policy. Failure to contact the course instructor, in the case of missed coursework, or the Program Manager, in the case of a missed test/examination, within the specified 48-hour window will result in a grade of zero (0) on the coursework/exam and no further consideration will be granted.

*Note: Supporting documentation will be required but will not ensure approval of accommodation(s).
Academic Integrity

You are expected to exhibit honesty and use ethical behaviour in all aspects of the learning process. Academic credentials you earn are rooted in principles of honesty and academic integrity. Academic dishonesty is to knowingly act or fail to act in a way that results or could result in unearned academic credit or advantage. This behaviour can result in serious consequences, e.g. the grade of zero on an assignment, loss of credit with a notation on the transcript (notation reads: “Grade of F assigned for academic dishonesty”), and/or suspension or expulsion from the university.


It is your responsibility to understand what constitutes academic dishonesty. For information on the various types of academic dishonesty please refer to the Academic Integrity Policy, located at


The following illustrates only three forms of academic dishonesty:

  1. Plagiarism, e.g. the submission of work that is not one’s own or for which other credit has been obtained.
  2. Improper collaboration in group work.
  3. Copying or using unauthorized aids in tests and examinations.
Academic Accommodations:

Participants with disabilities who require academic accommodations must contact Student Accessibility Services (SAS) to meet with an appropriate Disability Services Coordinator. To contact SAS, phone 905-525-9140 ext. 28652, or email For further information, consult McMaster University’s Policy for Academic Accommodation for Students with Disabilities.

On-line Elements:

In this course, we will be using on-line elements, which may include email, Avenue to Learn, WebEX, and external web sites.  Participants should be aware that, when they access the electronic components of this course, private information such as first and last names, user names for the McMaster e-mail accounts, and program affiliation may become apparent to all other participants in the same course. The available information is dependent on the technology used. Continuation in this course will be deemed consent to this disclosure. If you have any questions or concerns about such disclosure please discuss this with the instructor.
Course Changes:
The instructor reserves the right to modify elements of the course and will notify participants accordingly.
Course Withdrawal Policy:

Policies related to dropping a course and course withdrawals are posted to Continuing Education’s program webpage, FAQs & Policies.

Storm Closure Policy:
In the event of inclement weather, the Centre for Continuing Education will abide by the University’s Storm Closure Policy, and will only close if the University is closed. All in-class courses, exams and room bookings by internal and external clients will be cancelled if the Centre for Continuing Education is closed. On-line courses will take place as scheduled.
Grading Scale:
Course Schedule:


Topic & Materials


Overview and Set Up

· Course introduction

· Data Science Libraries and Resources

· Environment Set Up

· Required Packages and Modules



· Working with DataFrames

· Exploratory Data Analytics

· Analytics and Visualization


Natural Language Processing

· NLP Overview

· Preprocessing

· Auto-translation

· Word Frequency

· Document Frequency

· Topic Modeling

· Sentiment Analysis
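
To give a sense of two of the topics listed above, word frequency and document frequency, here is a minimal sketch using only the Python standard library. It is an illustration, not the course's own material: the three-sentence corpus, the `tokenize` helper and the regular-expression tokenizer are all assumptions made for the example, and the libraries actually used in class may differ.

```python
from collections import Counter
import re

# Hypothetical mini-corpus, invented for illustration only.
docs = [
    "The port was closed after the storm.",
    "Storm damage closed the northern road.",
    "The road reopened on Friday.",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a crude assumption)."""
    return re.findall(r"[a-z]+", text.lower())

tokenized = [tokenize(d) for d in docs]

# Word frequency: total occurrences of each token across the whole corpus.
word_freq = Counter(tok for toks in tokenized for tok in toks)

# Document frequency: number of documents that contain each token at least once.
doc_freq = Counter(tok for toks in tokenized for tok in set(toks))

print(word_freq["the"], doc_freq["the"])      # 4 occurrences, in 3 documents
print(word_freq["storm"], doc_freq["storm"])  # 2 occurrences, in 2 documents
```

The difference between the two counts is what makes document frequency useful: a token such as "the" is frequent but appears in every document, so it does little to distinguish one document from another.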


Machine Learning

· Machine Learning Overview

· Preprocessing

· Classification/Regression Problems

· Supervised/Unsupervised Learning

· Fitting and Predicting

· Testing and Validation
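
The fit, predict and test steps listed above can be sketched in a few lines. The example below is a hedged illustration rather than the method taught in the course: it swaps in a simple nearest-centroid classifier written with the standard library only, and the data points, labels, and the `fit` and `predict` function names are hypothetical.

```python
# Hypothetical labelled data: (feature vector, label). Invented for illustration.
train = [((1.0, 1.2), "a"), ((0.8, 1.0), "a"), ((4.0, 4.2), "b"), ((4.2, 3.8), "b")]
test = [((1.1, 0.9), "a"), ((3.9, 4.1), "b")]  # held out, never seen by fit()

def fit(samples):
    """Fitting: compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}

def predict(centroids, features):
    """Predicting: return the label whose centroid is nearest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((x - y) ** 2 for x, y in zip(features, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))

centroids = fit(train)
predictions = [predict(centroids, features) for features, _ in test]

# Testing: compare predictions with the true labels of the held-out samples.
accuracy = sum(p == y for p, (_, y) in zip(predictions, test)) / len(test)
print(predictions, accuracy)
```

Keeping the test samples out of `fit` is the point of the testing and validation step: accuracy measured on data the model has already seen would overstate how well it generalizes.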


Capstone Exercise

· Visualize and analyze your collected data for insights