Arifuzzaman Joy | AI/ML Engineer & MLOps Specialist

Case Studies

Selected Projects

Production-grade AI systems with proven performance improvements and cost savings

Featured in Gen-Verse Repository • arXiv:2511.20639 • HuggingFace #1 Paper of the Day

LatentMAS-SLoRA: Multi-Agent Reasoning with S-LoRA

Multi-agent framework in which agents collaborate in latent space rather than token space, augmented with role-specialized LoRA adapters. Featured as one of five community extensions in the official Gen-Verse LatentMAS repository (766+ stars).

Technical Approach

Latent Collaboration: Agents communicate via hidden states, reducing token usage by 50-80% and achieving a 3-7× speedup
S-LoRA Integration: Four specialized adapters (Medical, Reward, Comics, Detection) with dynamic hot-swapping
Multimodal Support: Qwen2.5-VL-7B-Instruct foundation with vision-language reasoning
Production Infrastructure: RunPod serverless deployment with Docker and CI/CD, at 200 ms latency per agent
Advanced RAG: Document injection via URL, base64, JSON with domain-aware routing
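The document-injection and domain-aware routing steps above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the payload shape (`kind`/`data`), the keyword lists, the default adapter, and the function names `decode_document` and `route_adapter` are hypothetical and are not taken from the project's code. URL fetching is deliberately left out so the example runs offline.

```python
import base64
import json

# Hypothetical keyword lists, one per adapter named in the list above.
# The project's real routing logic is not described in the source.
ADAPTER_KEYWORDS = {
    "medical": ("diagnosis", "patient", "tumor", "clinical"),
    "comics": ("panel", "comic", "speech bubble"),
    "detection": ("detect", "bounding box", "locate"),
    "reward": ("score", "rating", "preference"),
}
DEFAULT_ADAPTER = "reward"  # assumption: fallback when no keyword matches


def decode_document(payload: dict) -> str:
    """Normalize an injected document to plain text.

    Assumes a payload like {"kind": "base64" | "json" | "url", "data": ...}.
    """
    kind = payload["kind"]
    data = payload["data"]
    if kind == "base64":
        return base64.b64decode(data).decode("utf-8")
    if kind == "json":
        # Accept either a JSON string or an already-parsed object.
        obj = json.loads(data) if isinstance(data, str) else data
        return json.dumps(obj, sort_keys=True)
    if kind == "url":
        # Fetching is out of scope for this sketch; return a placeholder.
        return f"[document at {data}]"
    raise ValueError(f"unsupported document kind: {kind!r}")


def route_adapter(text: str) -> str:
    """Pick the adapter whose keywords occur most often in the text."""
    lowered = text.lower()
    scores = {
        name: sum(lowered.count(keyword) for keyword in keywords)
        for name, keywords in ADAPTER_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else DEFAULT_ADAPTER
```

In a real deployment the keyword scorer would more plausibly be replaced by a classifier or embedding-based router; the sketch only shows where the routing decision sits relative to document decoding.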

75% VRAM Reduction

40% Latency Decrease

50-80% Token Reduction

3-7× Speedup

PyTorch • Qwen2.5-VL-7B • S-LoRA • PEFT • vLLM • RunPod • Docker • CI/CD


2025

AI Calling Agent Platform

Real-time voice conversation platform with SIP/WebRTC telephony integration. Achieved sub-500ms latency with emotionally expressive speech synthesis.

LiveKit • GPT-4 • WebRTC • RunPod

2025

Multi-GPU Video Generation

Distributed inference pipeline for video generation (text-to-video, image-to-video, speech-to-video). Achieved a 3× throughput increase using FSDP on serverless GPU clusters.

PyTorch • FSDP • Modal • DiT

2024

Custom LoRA Training Pipeline

Self-hosted Flux.1 Dev with custom LoRA fine-tuning infrastructure. Delivered an 80% cost reduction for client photography workflows.

Flux • LoRA • Gradio • Modal

2024

Enterprise RAG System

Multi-modal retrieval-augmented generation with vector search and semantic chunking. Achieved a 40% accuracy improvement over the baseline implementation.
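As a rough illustration of semantic chunking followed by vector search, the sketch below groups adjacent sentences while they stay similar and then ranks chunks against a query by cosine similarity. It is a toy under stated assumptions: `embed` is a bag-of-words stand-in for a real embedding model, the 0.2 threshold is arbitrary, and the function names are hypothetical. It does not use LangChain or Pinecone and is not the system's actual implementation.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def semantic_chunks(text: str, threshold: float = 0.2) -> list[str]:
    """Split into sentences; start a new chunk when the similarity
    between adjacent sentences drops below the threshold."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    groups: list[list[str]] = []
    for sentence in sentences:
        if groups and cosine(embed(groups[-1][-1]), embed(sentence)) >= threshold:
            groups[-1].append(sentence)
        else:
            groups.append([sentence])
    return [" ".join(group) for group in groups]


def search(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

With a production embedding model and a vector database, only `embed` and `search` would change; the chunk-boundary rule is independent of the storage backend.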

LangChain • Pinecone • GPT-4 • FastAPI

2024

Voice-Pro: Speech Processing Platform

Web application for speech recognition, translation, and voice cloning across 100+ languages. Supports YouTube processing and real-time translation.

Whisper • F5-TTS • Deep-Translator • Python

2024

Medical Imaging with Transformers

Brain tumor classification and segmentation using ConvNeXt V2 and SegFormer. Achieved 99.6% diagnostic accuracy on the evaluation dataset.

PyTorch • ConvNeXt V2 • SegFormer • Transformers

Career

Professional Experience

Building production AI systems and conducting applied research

AI & Machine Learning Engineer

Freelance — Multiple Clients

2023 — Present

Develop and deploy ML/AI models for multi-modal tasks, including image generation, video synthesis, NLP, and voice AI
Design and implement serverless GPU infrastructure with Docker and Kubernetes, achieving 60%+ cost reduction
Build production RAG systems and multi-agent frameworks with measurable performance improvements

Research Assistant

Rajshahi University — Solar Lab / AI Lab

Mar 2022 — May 2023

Conducted research on renewable energy (solar cells) and speech processing using ML/DL techniques
Applied machine learning to analyze simulation data and improve photovoltaic performance
Published 4 peer-reviewed papers in Q1 journals with impact factors up to 7.1

Technical Expertise

Skills & Technologies

Full-stack ML engineering with production-grade tools and frameworks

AI & Machine Learning

Generative AI • LLMs • Multi-Agent Systems • Deep Learning • NLP • Computer Vision • RAG

MLOps & Cloud

Docker • Kubernetes • CI/CD • AWS • Azure • ML Monitoring • Logging

Frameworks & Tools

PyTorch • TensorFlow • HuggingFace • LangChain • vLLM • FastAPI • Gradio

Serverless GPU

RunPod • Modal • Replicate • Lambda Labs • FSDP • DeepSpeed

Languages

Python • SQL • JavaScript • Bash • MATLAB

LLMs & Models

GPT-4 • Claude • Llama • Qwen • Flux • LoRA/QLoRA

Research

Publications

7 peer-reviewed publications • 4 Q1 journals (IF up to 7.1) • Google Scholar Profile

Numerical prediction on the photovoltaic performance of CZTS-based thin film solar cell

Nano Select, 2023

Scopus

Spectrum estimation for voiced speech using average weighted linear prediction

2024

Speech Processing

Enhancement of Bone Conducted Speech Using Deep Transfer Learning

2024

Deep Learning

Contact

Get in Touch

Open to AI/ML Engineering roles, MLOps consulting, and collaborative research projects

Email

Joy.apee@gmail.com

Phone

+880 1521 417908

LinkedIn

arifuzzaman-joy-ru

GitHub

@Arifuzzamanjoy

Production-Grade AI & MLOps Systems


Publications

Machine learning assisted revelation of the best performing single hetero-junction thermophotovoltaic cell

Machine Learning-Enabled Performance Exploration of AuCuSe₄ in Thermophotovoltaic Cell

Unleashing the Power of Open-Source Transformers in Medical Imaging

Numerical studies on a ternary AgInTe₂ chalcopyrite thin film solar cell

