Topics in Automated Reasoning and Program Analysis

ECS 289C (Programming Languages and Compilers)

Fall 2017

Instructor: Aditya Thakur (avthakur@ucdavis.edu)

Lecture Time and Location: TR 4:40 - 6:00 pm, Cruess Hall 107

Credits: 4

Office Hours Time and Location: Friday 1:30 - 2:30 pm, Academic Surge Building 2073

Syllabus

Canvas

Course overview:

This course will cover the theory and practice of automated reasoning and program analysis, with a focus on connections between these research areas. Topics in automated reasoning may include: propositional satisfiability, theorem provers, separation logic, and model checking. Topics in program analysis may include: static program analysis (abstract interpretation), dynamic program analysis, automated testing, and symbolic execution.

The course will draw primarily on research readings, and it will contain a mix of lectures and student paper presentations; evaluation will be based on these presentations as well as a course project. This course will provide a reasonable overview of the state-of-the-art in automated reasoning and program analysis; thus, this course would benefit graduate students interested in pursuing research in these areas. Moreover, the course project can be tailored to an area of personal interest that might benefit from automated reasoning and program analysis techniques.

Course schedule

Date	Presenter	Topic
9/28	Aditya Thakur	Course Overview Slides
10/3	Aditya Thakur	Introduction to SAT pptx
10/4		Email paper presentation preferences by 6 pm
10/5	Aditya Thakur	SAT continued pptx, Introduction to SMT pptx
10/9		Project pre-proposal due at 6 pm
10/10	———————————	Engineering SAT solvers
	Wen-Chi Hung	Chaff:Engineering an Efficient SAT Solver, DAC 2001
	Anthony Di Franco	Extending SAT Solvers to Cryptographic Problems, SAT 2009
10/12	———————————	Thinking outside the DPLL-CDCL box
	Aditya Thakur	A Tutorial on Staalmarck’s Proof Procedure for Propositional Logic, FMCAD 1998 pptx
	Shaikh Mohammed Ismail	Local search strategies for satisfiability testing, 1993
10/17	———————————	Going from SAT to SMT
	Joseph McGee	DPLL(T): Fast Decision Procedures, CAV 2004
	Giap D. Le	A fast linear-arithmetic solver for DPLL(T), CAV 2006
10/19	———————————	Many cores and many theories
	Bryan Ching	ManySAT: a parallel SAT solver, JSAT 2009
	Sai Kopparthi	Simplification by cooperating decision procedures, TOPLAS 1979
10/20		Project proposal due; submit via Canvas
10/24	———————————	Reasoning about graphs and networks
	Gang Chen	SAT Modulo Monotonic Theories, AAAI 2015
	Jonathan Castello	Header Space Analysis: Static Checking For Networks, NSDI 2012
10/26	———————————	SMT meets neural nets
	Kathy Zhang	Safety verification of deep neural networks, CAV 2017
	Yiran Wang	Reluplex: An Efficient SMT Solver for Verifying Deep Neural Networks, CAV 2017
		Automated reasoning ends. Program analysis begins.
		Midpoint review
10/31	———————————	Automated testing
	Sai Kopparthi	Directed Greybox Fuzzing, CCS 2017
	Giap D. Le	KLEE: Unassisted and Automatic Generation of High-Coverage Tests for Complex Systems Programs, OSDI 2008
11/2	———————————	Static analysis of programs
	Aditya Thakur	Abstract Interpretation Primer
11/7	———————————	Learning from program executions
	Shaikh Mohammed Ismail	Dynamically Discovering Likely Program Invariants to Support Program Evolution, TSE 2001
	Jonathan Castello	Instrumentation and Sampling Strategies for Cooperative Concurrency Bug Isolation, OOPSLA 2010
11/9	———————————	Reachability
	Aditya Thakur	Precise Interprocedural Dataflow Analysis via Graph Reachability, POPL 1995 pptx
	Bryan Ching	TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones, OSDI 2010
11/13		Project check-in this week
11/14	———————————	Reasoning about pointers
	Joseph McGee	New Results on the Computability and Complexity of Points-to Analysis, POPL 2003
	Anthony Di Franco	Graspan: A Single-machine Disk-based Graph System for Interprocedural Static Analyses of Large-scale Systems Code, ASPLOS 2017
11/16	———————————	From theory to practice
	Dr. Eddie Aftandilian	Static Analysis at Google
11/21	———————————	Numeric invariants and the heap
	Kathy Zhang	The octagon abstract domain, AST 2001
	Yiran Wang	Parametric shape analysis via 3-valued logic, POPL 1999
11/28	———————————	Decision Procedures and Abstract interpretation: Better Together!
	Aditya Thakur	Symbolic Abstraction pptx
11/30	———————————	Threads
	Wen-Chi Hung	KISS: Keep It Simple and Sequential, PLDI 2004
	Aditya Thakur	Class Overview Discussion pptx
12/5	———————————
	Dr. Tristan Ravitch	Informal Methods for Large Code, Formal Methods for Small Code
12/7	———————————
	Prof. Cindy Rubio	Dynamic Analyses for Floating-Point Precision Tuning
	———————————
12/11	10:30 am - 2 pm	Final Project Presentations

Invited Lectures

Static Analysis At Google

Software bugs cause critical problems for software companies and their users. For example, a bug in Apple’s SSL implementation (“goto fail”) caused it to accept invalid SSL certificates. Heartbleed allowed attackers to read the memory contents of any server that used OpenSSL. A bug related to date formatting caused a large-scale Twitter outage. Many of these bugs are statically detectable, and in fact are obvious with a close reading of the code, yet they make it into production software anyway.

Previous work has reported on reasons why engineers don’t use static analysis tools or why they ignore their warnings. These reasons include: (1) Tools are not integrated into the workflow, (2) warnings are not actionable, (2) engineers do not trust the tool, (3) “survivor bias” means the bug does not manifest in practice, (4) fixing the bug is too expensive or risky, and (5) the tool’s explanation doesn’t adequately explain the problem.

In this talk, I will describe how we have applied these lessons, as well as lessons from Google’s previous experience with the FindBugs static analysis tool, to build a successful static analysis infrastructure that is used daily by the almost all engineers at Google. Our tools scale to tens of thousands of engineers and hundreds of millions of lines of code, and detect over 2500 bugs per day, before the problematic code is checked into the codebase. The bugs our tools detect are fixed by engineers, by their own choice, rather than because of a mandate from managers, and we solicit and address feedback to ensure that our tools are providing value to users.

About the Speaker

Eddie Aftandilian is a Senior Software Engineer at Google, where he leads the Java Source Tooling, Compilation and Analysis team. He completed his PhD at Tufts University, working with Sam Guyer to help Java developers better understand the memory behavior of their programs. His team helps Google’s developers write better Java code – more correct, more readable, and with better performance. Their static analysis tool, Error Prone, is integrated into Google’s build system and code review tool and catches hundreds of bugs each day.

Informal Methods for Large Code, Formal Methods for Small Code

Galois uses a wide variety of program analysis techniques across our projects. In this talk, I will describe some work we have done at opposite ends of the spectrum of automated program reasoning. First, I will describe work we have done on information flow analysis in Android applications with the aim of identifying colluding applications. This was a large scale analysis necessitating trade-offs and careful scoping. Second, I will describe work we have done to formally verify parts of s2n, which is an open source implementation of TLS. This work was much more focused on smaller pieces of code, but provides significantly stronger guarantees.

About the Speaker

Dr. Tristan Ravitch is a research lead at Galois, which is a company dedicated to building trustworthy systems. He is currently working on binary analysis and verification projects targeting embedded and control systems. He has also worked on static analysis of Android applications and privacy-preserving computation. Dr. Ravitch completed his Ph.D. at the University of Wisconsin in 2013.