Software Security via Program Analysis

Class: 16741 CS 6501 Section 008

Instructor:

Yonghwi Kwon (yongkwon@virginia.edu, http://yongkwon.info)

Time and Place:

Tu., Thurs., 9:30AM - 10:45AM @ Rice 340

Office Hours: Tu 10:45AM ~ 12:00PM @ Rice 505

We use a Google class site for homework assignments and announcements.

Course Description:

Cyberattacks are becoming more and more sophisticated. State-funded attackers spend tremendous time and effort to infiltrate organizations (e.g., enterprises and government agencies), leveraging stealthy and sophisticated attack mechanisms (e.g., zero-day exploits).

To fight back against those attackers, researchers and industry have proposed various advanced techniques. As attackers break into systems in various ways, building fundamental protection against them requires techniques across various layers of software and a fundamental understanding of both the system and the attackers.

  • This course will cover recent advances in cyberattack prevention and analysis via program analysis and reverse-engineering. In particular, we will focus on understanding recent advances in these topics by reading, presenting, and discussing details of recent publications in top security conferences (S&P, USENIX Security, CCS, and NDSS).
  • You will learn (1) how to analyze vulnerabilities and exploits to understand the root causes of an attack and come up with fundamental solutions, (2) how to investigate sophisticated cyberattacks in order to pinpoint how attackers infiltrated systems and what they did (e.g., leaked secrets), and (3) how we can leverage program analysis techniques to automate the above tasks and make software more secure.
  • You will (1) read recent academic papers carefully and present the essence of those papers, (2) learn how to implement advanced dynamic and static analyses used in attack prevention and investigation, and (3) learn how to conduct systems security research that builds fundamental software.
  • It would be great if you have basic knowledge of or experience in system programming in C (assembly is a big plus). Experience with dynamic program analysis tools (e.g., Intel Pin, DynamoRIO), static program analysis tools (e.g., LLVM), and/or reverse-engineering tools (e.g., IDA Disassembler, OllyDbg, Immunity Debugger) is very welcome.

Opportunities:

  • If your individual project is well developed, I will help you turn it into a research paper. Once it is accepted (at a conference or workshop), your travel and presentation expenses for the conference (or workshop) will be covered.
  • If we both agree that your research interests and potential are well aligned with mine, we may seek potential funding sources for your Ph.D. program.

Information flow can be tracked via program analysis, meaning that we can understand how Siri and Alexa laughed.

Schedule

Week 1 (Aug 27, 29): Class Introduction / Project Descriptions / How to Read/Critique/Present System Security Papers

Week 2 (Sep 3, 5): Program Tracing / (Project 1 and Pin introduction)

What is tracing? Why is it needed? How can we automatically trace a program?

  • Project 1 Start -- 10% of the grade (Will be released at Friday midnight)
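
As a rough illustration of what "automatically tracing a program" means, the sketch below records every Python function called while a target function runs. It is only an analogy: the course project uses Intel Pin on native binaries, whereas this uses Python's standard `sys.settrace` hook. The names `trace_calls`, `helper`, and `target` are hypothetical and exist only for this example.

```python
import sys

def trace_calls(func, *args):
    """Run func(*args) and return (result, names of Python functions called)."""
    calls = []

    def tracer(frame, event, arg):
        # The interpreter invokes this hook whenever a new Python frame is entered.
        if event == "call":
            calls.append(frame.f_code.co_name)
        return None  # skip per-line tracing inside the callee

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore the earlier hook
    return result, calls

def helper(x):
    return x + 1

def target(x):
    return helper(x) * 2

if __name__ == "__main__":
    result, calls = trace_calls(target, 3)
    print(result, calls)
```

A binary instrumentation tool follows the same pattern at a lower level: it registers a callback that fires on instructions, basic blocks, or routine entries instead of Python frames.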

Week 3 (Sep 10, 12): Dynamic Analysis (Dynamic Slicing / Information Flow)

What is dynamic analysis? What are slicing and information flow? Why do we need them? How can we do them effectively? What can we do with them?

  • Paper Assignment
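
To make the idea of dynamic information flow tracking concrete, here is a minimal sketch, assuming a toy setting in which values are integers and only addition and multiplication are tracked. Each value carries a set of labels naming the inputs it depends on, and every operation propagates the union of its operands' labels. The `Tainted` class, the `sink` function, and the label names are hypothetical; real systems track flows at the instruction or byte level.

```python
def _as_tainted(x):
    """Wrap a plain value as an untainted Tainted value."""
    return x if isinstance(x, Tainted) else Tainted(x)

class Tainted:
    """An integer paired with the set of source labels it depends on."""

    def __init__(self, value, labels=()):
        self.value = value
        self.labels = frozenset(labels)

    def __add__(self, other):
        other = _as_tainted(other)
        # The result depends on both operands, so it carries both label sets.
        return Tainted(self.value + other.value, self.labels | other.labels)

    __radd__ = __add__  # addition is commutative

    def __mul__(self, other):
        other = _as_tainted(other)
        return Tainted(self.value * other.value, self.labels | other.labels)

    __rmul__ = __mul__  # multiplication is commutative

def sink(x):
    """Hypothetical sink: report which sources reach this point."""
    return sorted(_as_tainted(x).labels)

if __name__ == "__main__":
    secret = Tainted(42, {"password"})
    public = Tainted(7, {"user_id"})
    out = secret * 2 + public
    print(out.value, sink(out))
```

This sketch tracks only explicit data dependences. Flows through control dependences (for example, branching on a secret and assigning a constant in each branch) are not captured and need additional handling.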

Week 4/5/6 (Sep 17, 19, 24, 26, Oct 1, 3): Dynamic Analysis (Information Flow) / Reverse Engineering (Disassembly)

Topic 1: How can we leverage our knowledge of information flow to build secure systems?

Topic 2: What is reverse engineering? Basic principles of disassembly. Recovering semantics

Week 7/8 (Oct 1, 3, Oct 10): Reverse Engineering (Advanced Disassembly, Decompiler, Anti-Debugging Techniques) / (Project 2 and LLVM introduction)

Recovering semantics, Decompilers, Anti-debugging techniques and solutions against them

  • Project 2 Start -- 10% of the grade (Will be released at Friday midnight)

Week 9 (Oct 15, 17): Reverse Engineering / (Project 2 and LLVM introduction)

Recovering semantics, Decompilers, Anti-debugging techniques and solutions against them

  • (Tentative) Project 1 Due (Sunday midnight)

Week 10 (Oct 22, 24): Reverse Engineering / Static Analysis

What is static analysis? How does it compare to dynamic analysis? What are the common static analysis techniques for security applications? => Answer: Value-set Analysis, Control-flow Integrity, Data-flow Integrity.
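
The contrast with dynamic analysis can be seen in a small sketch: a static analysis inspects the program text without ever running it. The example below, assuming Python source as the target and a hypothetical list of "risky" function names, parses code with the standard `ast` module and reports direct calls to those functions. It is far simpler than the techniques named above and is meant only to show the static viewpoint.

```python
import ast

RISKY_CALLS = {"eval", "exec"}  # hypothetical policy chosen for this example

def find_risky_calls(source):
    """Return sorted (function name, line number) pairs for direct risky calls."""
    tree = ast.parse(source)  # builds a syntax tree; the code is never executed
    findings = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in RISKY_CALLS
        ):
            findings.append((node.func.id, node.lineno))
    return sorted(findings, key=lambda item: (item[1], item[0]))

if __name__ == "__main__":
    sample = "x = input()\ny = eval(x)\nprint(y)\n"
    print(find_risky_calls(sample))
```

Because it never runs the code, this check covers every path in the source, but it also misses indirect calls such as `getattr(builtins, "eval")(x)`. Coverage versus precision is the usual trade-off between static and dynamic analysis.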

Week 11 (Oct 29, 31): Research proposal discussion (tentative) - Oct 29, Guest Lecture - Oct 31

Guest lecture on PKI security and certificate abuse in the real world.

  • Open discussion sessions for the projects, with peer feedback.

Week 12 (Nov 5, 7): Cyber Forensics

TBD

  • Project 3 Start -- 10% of the grade

Week 13 (Nov 12, 14): No classes

Week 14 (Nov 19, 21): Script Language Security

Analyzing real-world vulnerabilities in web browsers to understand and prevent attacks.

Week 15 (Nov 26, 28): Thanksgiving -- No classes

Materials will be provided even though there are no classes this week.

Understand how and why the web is insecure. Learn various methods to prevent attacks on the web ecosystem across server and client.

  • (Tentative) Project 2 Due

Week 16 (Dec 3, 5): Final Presentations (Outcomes of your individual projects)

Individual students present final results of the course project (individual project) -- 40% of the grade. (breakdown: 10% for presentation, 10% for a report, 20% for the project artifacts)

  • (Tentative) Project 3 Due

Reading Days (Oct 8), Thanksgiving (Nov 28), Final week (Dec 10, 12): No class

Assessment:

This class has no exam. The grading is based on projects and presentations.

1. Presentation: 20% (10% for understanding of the paper, 10% for effective presentation)

2. Assignments: 30% (3 assignments; each 10%)

3. Independent Research Project: 40% (20% for the design and implementation, 10% for a presentation, 10% for a report)

4. Class participation: 10% (Questions and Reviews for the papers discussed in the class)

I will hand out my business cards to students who participate actively in class. At the end of the semester, return the cards to redeem your credits.

5. Extra credit: Extra assignments: TBD% (To be announced)

Topics:

Dynamic Program Analysis

  • Data-flow tracking
  • Control-flow tracking

Static Program Analysis

  • Data-flow analysis
  • Control-flow analysis
  • Pointer/alias analysis

Reverse-engineering

  • Evasive techniques
  • Code obfuscation/de-obfuscation

Operating System Security

  • Sandboxing/isolation, Fault localization
  • Record and replay based analysis
  • Audit-logging

Web Security

  • Script Language Security (JS/Flash)
  • Browser Security
  • Malicious Advertisement

Mobile Security

  • Security Issues on Android/iOS
  • Program Analysis techniques for Mobile Platforms

IoT Security

  • Security Issues on heterogeneous IoT platforms
  • Improving IoT security via program analysis

Reading List:

This reading list includes representative publications that will be covered during this class. Papers will be added during the semester. Please use them to understand high-level themes of the class topics.

Particularly for systems security papers: (1) Read Abstract -> Introduction -> Conclusion, (2) Find and read a motivating (representative) example or case study. These include a complete (and often realistic) story and show how the proposed idea solves the problem with newly proposed methods.

Dynamic/Static Analysis Frameworks

Data-flow tracking and Data-flow analysis

Control-flow tracking and Control-flow analysis

Evasive techniques

Code obfuscation/de-obfuscation

Record and replay / N-version systems

Audit-logging

Web/Browser Security

Sandboxing/isolation, Fault localization

Mobile/IoT Security

Machine Learning (Added)