This site contains documentation for v0.2 of the Intrepydd programming system being developed in the DDARING research project with contributors from Georgia Tech, UIUC, U.Michigan, and USC. Intrepydd is a Python-based domain-specific programming language for data analytics kernels, suitable for ahead-of-time compilation on current and future hardware platforms and accelerators. Intrepydd is not intended for writing complete/main programs. This release (v0.2) contains an implementation of an early version of the Intrepydd language. We welcome feedback from testers of this release; your feedback will influence future enhancements to the Intrepydd language, its implementation, and its documentation.
View the Project on GitHub hpcgarage/intrepyddguide
Recommended steps in using the Intrepydd v0.2 release
We assume that you have access to an Intrepydd release with the pyddc
command, along with a standard Python environment.
The recommended use of the Intrepydd v0.2 release in implementing a
data analytics workflow is as follows:
- Create a pure Python implementation of the workflow, using
standard libraries such as NumPy, CombBLAS, and PyTorch.
- Use a standard Python profiler to identify the performance-critical code regions of the Python implementation.
- Select a performance-critical code region in the Python code that is a promising
candidate to convert to Intrepydd code. (If the Python code uses
a library that is not supported by
Intrepydd, then you will need to implement the functionality of
that library yourself, e.g. by using for/pfor loops.)
- Insert calls to evaluate the Energy-Delay-Squared
goal metric, in
Joules-Seconds^2, for the core computation in the pure Python
implementation (ignoring initialization, data input, and data output).
- Move the performance-critical function to a new function in a single
Intrepydd file for the workflow (.pydd
extension).
- Add type declarations for function parameters and return values to
the new Intrepydd function;
sometimes, additional type declarations may be needed for some
of the internal assignment statements in the function.
- Compile the .pydd file with the pyddc command to automatically
generate C/C++ code from the .pydd file.
- Execute the new Python main program with calls to the optimized
Intrepydd code, and record its new Energy-Delay-Squared goal metric (in
Joules-Seconds^2).
- Try improving the performance of the code in the .pydd file by
algorithmic improvements, parallelization and locality
optimizations suggested in the page on Code optimization techniques.
- Repeat steps 7-9 for the current performance-critical code region.
- Repeat steps 2-10 for additional performance-critical code regions.