Projects
My research focuses on understanding human vision and cognition, and on improving the quantity and quality of AI data.
Reflect-DiT
Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
VideoMultiAgents
A multi-agent framework that integrates specialized agents for vision, scene graph analysis, and text processing
SegLLM
A multi-round conversation agent capable of localizing objects following natural language instructions
Panasonic-LLM-100b
Development of Japan's largest-scale Japanese LLM (100 billion parameters) with Stockmark Inc.
Diffusion-KTO
Aligning text-to-image diffusion models by formulating the alignment objective as the maximization of expected human utility
Wild2Avatar
A method to render high-fidelity human avatars from in-the-wild monocular videos in which the subject is partially occluded
Invisible-to-Visible
Privacy-Aware Human Segmentation using Airborne Ultrasound via Collaborative Learning Probabilistic U-Net
CFLOW-AD
Real-Time Unsupervised Anomaly Detection with Localization via Conditional Normalizing Flows
Contrastive Neural Processes
A new method for self-supervised learning that does not require augmentation engineering
Medical Image Retrieval
A technology to search for similar cases by reproducing the points that doctors focus on during diagnosis
Brain Machine Interface
A technology that uses brain wave patterns to estimate the maximum acceptable sound volume for hearing aids