Tools We Use
This post lists the tools, frameworks, and environments we use in my research group, mainly for prospective students and those new to the group. Some of these are fairly common, while others are specific to our research projects.
The following are used in all our projects:
- Operating system: Linux.
- Programming languages: C++, Python, LaTeX, bash scripts. We are fairly conservative in our choice of programming languages; this keeps the barrier to entry low for our various research projects and ensures good tool support for the languages we use.
- git: distributed version control system.
- Bazel: build and test tool supporting multiple languages. There was definitely a steep learning curve and migration cost in moving to Bazel, but the ability to have a single build system across all our projects has been a net positive.
- pytest and GoogleTest: Python and C++ testing frameworks.
- Protocol buffers: language-neutral, platform-neutral extensible mechanism for serializing structured data. They are a great way not only to specify the input and output of our various tools, but also to simplify serializing intermediate state.
- gRPC: modern, open-source, high-performance Remote Procedure Call (RPC) framework. We have found that the combination of protobuf and gRPC allows us to build multi-language tools as well as reusable components.
- Docker: containers, mostly used to ease reproducibility.
- BenchExec: framework for reliable benchmarking and resource measurement.
- Amazon Web Services (AWS): cloud computing service. After testing and profiling on local machines, we usually use AWS to get the final experimental data, especially when doing large-scale studies on hundreds of benchmarks.
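For those who have not used pytest before, here is a minimal sketch of what a test file can look like. The file name `test_stats.py` and the helper `mean` are hypothetical, chosen only for illustration, and do not come from any of our projects. pytest collects functions whose names start with `test_` from files named `test_*.py` and reports any `assert` that fails.

```python
# test_stats.py (hypothetical file name, for illustration only)


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


def test_mean_of_integers():
    # pytest reports a failure if a plain assert statement is false.
    assert mean([1, 2, 3, 4]) == 2.5


def test_mean_of_single_value():
    assert mean([7]) == 7


def test_mean_rejects_empty_input():
    # Written without pytest helpers so that the file needs no imports.
    try:
        mean([])
    except ValueError:
        return
    assert False, "expected ValueError for empty input"
```

Running `pytest test_stats.py` from the directory containing the file executes all three tests.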
The use of the following tools depends on the specific project:
- LLVM: compiler infrastructure.
- PyTorch: machine learning framework.
- NumPy: package for scientific computing with Python.
- Intel Threading Building Blocks (TBB): library that supports scalable parallel programming using standard ISO C++ code.
- TCMalloc: fast, multi-threaded malloc implementation. TCMalloc is a must when doing parallel programming, for instance, using TBB.
- Eigen: C++ template library for linear algebra.
- LTTng and Babeltrace: tracing framework for Linux, and trace manipulation toolkit.
- pandas: Python data analysis library.
- Matplotlib: Python library for creating static, animated, and interactive visualizations.
- Z3: theorem prover.
Tutorials for these various tools can be found at the links above. An example of a project that uses a lot of these tools is SyReNN.