March 20-22, 2017
COURSE DESCRIPTION
This interactive course will teach security professionals and other participants how to apply data science techniques to quickly write scripts that manipulate and analyze network and security data, and ultimately uncover valuable insights from it. The course will introduce participants to the entire data science process, from data preparation, exploratory data analysis, data visualization, machine learning, and model evaluation to implementation at scale, all with a focus on security-related problems.
Participants will learn how to read in data in a variety of common formats, then write scripts to analyze and visualize that data. The course will cover:
• How to write scripts to efficiently read and manipulate CSV, XML, and JSON files
• How to quickly and efficiently parse executables, log files, and pcap files and extract artifacts from them
• How to make API calls to merge datasets
• How to use the Pandas library to quickly manipulate tabular data
• How to effectively visualize data using Python
• How to preprocess raw security data for machine learning and feature engineering
• How to build, apply and evaluate machine learning algorithms to identify potential threats
• How to use machine learning to identify anomalous network behavior and recognize potential network threats
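As a rough flavor of the first few topics above, here is a minimal sketch (not taken from the course materials) of a script that reads CSV and JSON data and merges the two. It uses only the Python standard library. The sample log lines, the asset list, and the function names load_events, denied_by_source, and enrich are all hypothetical and invented for illustration.

```python
import csv
import io
import json
from collections import Counter

# Hypothetical sample data, invented for illustration only.
CSV_LOG = """timestamp,src_ip,dst_port,action
2017-03-20T09:00:01,10.0.0.5,443,allow
2017-03-20T09:00:02,10.0.0.7,22,deny
2017-03-20T09:00:03,10.0.0.7,22,deny
2017-03-20T09:00:04,10.0.0.7,23,deny
2017-03-20T09:00:05,10.0.0.9,80,allow
"""

JSON_ASSETS = (
    '[{"ip": "10.0.0.5", "owner": "web-team"},'
    ' {"ip": "10.0.0.7", "owner": "unknown"}]'
)


def load_events(csv_text):
    """Parse CSV text into a list of dicts, one per log line."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def denied_by_source(events):
    """Count denied connections per source IP."""
    return Counter(e["src_ip"] for e in events if e["action"] == "deny")


def enrich(counts, assets_json):
    """Attach an owner from the JSON asset list to each source IP."""
    owners = {a["ip"]: a["owner"] for a in json.loads(assets_json)}
    return [
        {"src_ip": ip, "denied": n, "owner": owners.get(ip, "not listed")}
        for ip, n in counts.most_common()
    ]


events = load_events(CSV_LOG)
report = enrich(denied_by_source(events), JSON_ASSETS)
for row in report:
    print(row)
```

With real data, the CSV text would come from a log file and the JSON from a file or an API response; the counting and merging steps are the kind of work a library such as Pandas can shorten considerably.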
Finally, we will introduce participants to cutting-edge Big Data tools, including Apache Spark (PySpark) and Apache Drill, and demonstrate how to apply these techniques to extremely large datasets.
All lectures will be held in English.
PREREQUISITES
Participants will need to have a basic understanding of Python.
APPLY TO PARTICIPATE
Participation in the training session is free of charge, and is sponsored by the BIU Center in Applied Cryptography and Cyber Security.
The number of available spaces is limited. To participate, you must fill out an application that includes your CV and a transcript of your grades. All accepted participants must commit to attending the entire training session. The application deadline is March 5, 2017.
Please fill out the application form here
WHAT PARTICIPANTS SHOULD BRING
Participants must bring a laptop with VirtualBox or VMware installed, 8 GB of RAM, and 10 GB of storage.
WHAT PARTICIPANTS WILL BE PROVIDED WITH
A preconfigured virtual machine (VM) containing all the software needed for the class. The VM will also contain:
• All course slides, notebooks, reference sheets, handouts, and documentation
• Skeleton code examples for in-class exercises
TRAINER BIOGRAPHY
Mr. Charles Givre has always been interested in solving problems in unique ways, and has worked to make a career of it as a data scientist at Booz Allen Hamilton. At Booz Allen, Mr. Givre worked as a technical leader on various large government projects. Mr. Givre enjoys sharing his passion for data science with others and has worked to develop comprehensive data science training programs at his firm. Prior to joining Booz Allen, Mr. Givre worked as a counterterrorism analyst at the Central Intelligence Agency for nearly five years.
Mr. Givre became interested in Apache Drill several years ago, and is co-author of the first O’Reilly book about Drill. He has delivered numerous workshops about Drill and has contributed to the codebase. Mr. Givre is a sought-after speaker and has delivered training and talks at international conferences such as BlackHat, Strata + Hadoop World, Open Data Science Conference (ODSC) and others. Mr. Givre holds a Master of Arts in Middle Eastern Studies from Brandeis University, and a Bachelor of Science in Computer Science and a Bachelor of Music, both from the University of Arizona. Mr. Givre also holds CISSP, Security+ and various other certifications. Mr. Givre blogs at thedataist.com and, in his nonexistent spare time, enjoys spending time with his family and restoring classic cars.