Member: IEEE, IEEE Computer Society, IEEE Signal Processing Society
Computer Science PhD student in the Advanced Processor Technologies (APT) group at the University of Manchester, specializing in High Performance Computing (HPC). My research focuses on compiler and runtime optimizations, and on profiling for explicit parallel programs. A list of my publications can be found on my Google Scholar profile, and some of my contributions are in the pepperpots organization on GitHub. I have previous industrial experience in vectorizing and parallelizing applications, with a particular focus on real-time applications and signal processing. This includes writing modern C++ code using Advanced Vector Extensions (AVX) intrinsics and CUDA.
Marking assignments and assisting academic staff with running undergraduate laboratories in the Department of Computer Science. Teaching assistant for the following modules: System-on-Chip Design, Processor Microarchitecture, Operating Systems, and Microcontrollers.
Summer intern working on lossless image compression for self-driving cars. Developed a highly optimized CPU and GPU encoder and decoder, based on the JPEG standard, that achieve real-time performance while reducing the size of the data saved to the storage device.
Implemented and optimized physical layer algorithms for LTE and 5G NR networks on Intel Architecture using the AVX instruction set in C++. Designed and implemented a test framework and Python automation for the project that sped up the delivery process. Developed a MATLAB simulation of physical layer functions.
Vacation position at the School of Computer Science. Developed, tested and documented drivers for the School's Spartan-6 FPGA experimental boards used for teaching, including drivers for the HD44780 LCD, an LED matrix and I2C devices. Gained exposure to the design and verification of sequential systems and to Cadence software. Soldered boards and created technical documentation.
Postgraduate research student in the Advanced Processor Technologies (APT) Research Group in the Department of Computer Science, supervised by Dr Antoniu Pop. Research focuses on compiler and runtime optimizations, and on profiling for explicit parallel programs.
First Class (86%)
Third year project: "Low-Precision Neural Network Decoding of Polar Codes"
Elected Student Representative on the Student Staff Committee in 2015/16, 2016/17 and 2018/19. Studies focused on mobile systems and networks, computer architecture and microarchitecture, System-on-Chip design, and algorithms.
Studied a wide range of subjects, including Mathematics, Physics, Computer Science, Chemistry, Biology, Geography, History, Social Sciences and languages. Mainly focused on Mathematics, Physics and Computer Science. Represented the school in regional competitions.
Co-organising Peer Support sessions for around 250 second-year Computer Science students. Planning weekly activities to help students with their studies and with finding employment. Engaging with school staff to get the required support and improve students’ experience.
Co-organized four events in Manchester with a total of 1000 participants. Managed finances, logistics and relations with external companies. Raised sponsorship and supervised multiple teams working on the events.
Neural Network (NN) polar decoders have been getting much attention as a viable replacement for conventional decoders in 5G New Radio (NR). Despite scalability issues, the NN-based decoder is a promising technology as it can improve the latency of the standard Successive Cancellation (SC) decoder. It was shown that the Neural Successive Cancellation (NSC) decoder has an improved theoretical latency compared to the standard SC decoder. However, in contrast to SC, the NSC decoder uses large floating-point weight matrices which do not fit in CPU caches, leading to higher energy usage and lower computational performance due to the increased memory traffic. Additionally, such a high memory requirement would be expensive to implement in hardware and would require complex floating-point arithmetic. This paper presents a new low-precision NN decoder that can replace memory-heavy NN decoders inside the NSC decoder. We show that the size of the weights can be reduced by up to 54 times, with the wireless performance degradation varying between 0.1 dB and 0.4 dB compared to the floating-point implementation. Moreover, we show reductions of up to 438× and 555× in L1 and L2 data cache misses, respectively, in our prototype software implementation.
© 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.