Team Members: Tanisha Mehta (50%), May Paek (50%)
We implemented and benchmarked several parallel algorithms for solving tridiagonal systems of equations, targeting both multi-core CPUs (MPI and OpenMP) and NVIDIA GPUs (CUDA). On the CPU side, our deliverables include an MPI-based Brugnano block-SPIKE solver, an MPI-based recursive doubling solver, and an MPI-based differential equation solver, as well as OpenMP versions of the Brugnano and recursive doubling algorithms. On the GPU side, we developed a Cyclic Reduction (CR) solver, a Parallel Cyclic Reduction (PCR) solver, and a hybrid PCR/CR solver in CUDA. We also implemented a sequential Thomas algorithm as the baseline for correctness and performance comparison. We present performance, scalability, and trade-off analyses of the MPI, OpenMP, and CUDA implementations, with a focus on the balance between communication and computation. All implementations were tested on multi-core CPU machines and an NVIDIA RTX 3080 Ti GPU, and we plan to showcase our results and insights at the parallelism competition.
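
For readers unfamiliar with the baseline, the following is a minimal Python sketch of the sequential Thomas algorithm (forward elimination followed by back substitution). It is not the project's implementation: the function name `thomas_solve` and the array layout (`a` as sub-diagonal with `a[0]` unused, `b` as main diagonal, `c` as super-diagonal with `c[n-1]` unused, `d` as right-hand side) are assumptions chosen for illustration, and the sketch assumes the system needs no pivoting (for example, a diagonally dominant matrix).

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with the Thomas algorithm.

    Hypothetical layout (an assumption for this sketch), all lists of length n:
      a: sub-diagonal, a[0] is unused
      b: main diagonal
      c: super-diagonal, c[n-1] is unused
      d: right-hand side
    No pivoting is performed, so the matrix is assumed to be well conditioned
    (e.g. diagonally dominant).
    """
    n = len(b)
    cp = [0.0] * n  # modified super-diagonal
    dp = [0.0] * n  # modified right-hand side

    # Forward elimination: remove the sub-diagonal row by row.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Back substitution: solve from the last unknown upward.
    x = [0.0] * n
    x[n - 1] = dp[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# Example: 4x4 system with main diagonal 2 and off-diagonals -1.
a = [0.0, -1.0, -1.0, -1.0]
b = [2.0, 2.0, 2.0, 2.0]
c = [-1.0, -1.0, -1.0, 0.0]
d = [0.0, 0.0, 0.0, 5.0]
solution = thomas_solve(a, b, c, d)
```

Each step of both loops depends on the result of the previous one, so the algorithm runs in O(n) time but offers no parallelism along the system. That sequential dependency is what the recursive doubling, block-SPIKE, and cyclic reduction approaches restructure.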
| Week / Date Range | Tasks |
|---|---|
| Week 1 (Mar 25–31) | Finalize project plan, set up GitHub, start sequential Thomas implementation |
| Week 2 (Apr 1–7) | Finish and test sequential solver; start Paramveer MPI implementation |
| Week 3 (Apr 8–14) | Complete MPI implementation; finish parallel full-recursive-doubling factorization; begin differential equation implementation; benchmark against sequential baseline |
| Week 4 (Apr 15–18) | |
| Week 4 (Apr 19–21) | |
| Week 5 (Apr 22–24) | |
| Week 5 (Apr 25–28) | |