Yichao Cao (曹毅超), Ph.D., Distinguished Associate Professor and master's supervisor. His research interests include multimodal large models, embodied intelligence, and computer vision. Over the past five years he has had more than 20 papers accepted or published at high-level international conferences and journals, including CVPR, ICCV, NeurIPS, ICML, AAAI, ACM MM, ICCAD, DAC, ICASSP, Pattern Recognition, and IEEE TCSVT; some of this work was selected as an AAAI 2026 Oral and an ICCV 2025 Highlight. He holds more than 10 granted national invention patents. He has long served as a reviewer for conferences and top journals including CVPR, ICCV, NeurIPS, ICML, ICLR, ACM MM, IJCAI, ECCV, IEEE TMM, IEEE TNNLS, and IEEE TCSVT, and serves as a Senior Program Committee member for conferences such as IJCAI.

Google Scholar profile: https://scholar.google.com/citations?user=--8h8o0AAAAJ&hl=zh-CN

【Research Interests】
※ Multimodal large models (continual evolution and capability optimization of VLMs);
※ Embodied intelligence (perception, memory, generalization, and related capabilities of embodied VTLA large models);
※ Other interesting directions, including but not limited to the above.

【Admissions】
He admits 1-2 master's students each year and works closely with them on research, giving detailed guidance throughout, from experiment design to paper writing. The group provides ample research resources and also puts emphasis on fostering students' interest in and capacity for independent thinking, with the aim of producing high-level results.
【Update: 0 master's openings currently remain】

【Industry Experience】
While working in industry, he led the development of "ForestMind (林海思绪)", the first multimodal large model for the forestry/grassland sector (large-model registration number: Jiangsu-LinHaiSiXu-202508130039). The "device-edge-cloud collaborative forest fire recognition engine and integrated solution" that he led was evaluated as an internationally advanced achievement. The forest fire recognition solution developed under his leadership has been deployed in more than 20 provinces and more than 60 prefecture-level cities in China, substantially advancing the intelligent development of the forestry and grassland sector and establishing a number of industry demonstration projects. He also took part in drafting several industry standards for the forestry and grassland and emergency management sectors.

【Representative Papers (Past Five Years)】
[24] TriPAH: Imbalance-Aware Tri-Prompt Affinity Hashing for Cross-Modal Medical Retrieval. IEEE International Conference on Multimedia & Expo (ICME, CCF B, co-corresponding author), 2026
[23] CORE: Collaborative Observer-Reasoner Execution via Multi-Agents for Smart Home. IEEE International Conference on Multimedia & Expo (ICME, CCF B, co-corresponding author), 2026
[22] A lightweight physics-aware framework for multi-scale marine heatwaves forecasting. npj Climate and Atmospheric Science (Nature Partner Journals), 2026
[21] Learning Distribution-wise Foundation Prior Consistency and Instance-wise Style Calibration for Medical Image Generalization. Computer Vision and Pattern Recognition (CVPR, CCF A), 2026
[20] Localizing, Structuring, and Rendering: Bridging 3D and 2D Vision-Language-Action Models for Robotic Manipulation. Computer Vision and Pattern Recognition (CVPR, CCF A), 2026
[19] CompTrack: Information Bottleneck-Guided Low-Rank Dynamic Token Compression for Point Cloud Tracking. AAAI Conference on Artificial Intelligence (AAAI Oral, CCF A), 2026
[18] FocusTrack: One-Stage Focus-and-Suppress Framework for 3D Point Cloud Object Tracking. ACM Multimedia (ACM MM, CCF A), 2025
[17] Refining the Granularity of Smoke Representation: SAM-Powered Density-Aware Progressive Smoke Segmentation Framework. Pattern Recognition (PR, CAS Zone 1 TOP, first author), 2025
[16] CounterPC: Counterfactual Feature Realignment for Unsupervised Domain Adaptation on Point Clouds. International Conference on Computer Vision (ICCV Highlight, CCF A), 2025
[15] TinyMIG: Transferring Generalization from Vision Foundation Models to Single-Domain Medical Imaging. International Conference on Machine Learning (ICML, CCF A), 2025
[14] Perturbating, Tuning, and Collaborating: Harnessing Vision Foundation Models for Single Domain Generalization on Medical Imaging. AAAI Conference on Artificial Intelligence (AAAI, CCF A, co-first author), 2025
[13] A Novel Image-Graph Heterogeneous Fusion Framework for Static IR Drop Prediction. Design Automation Conference (DAC, CCF A), 2025
[12] Variational Feature Imitation Conditioned on Visual Descriptions for Few-Shot Fine-Grained Recognition. IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT, CAS Zone 1 TOP), 2025
[11] SmokeAgent: Multimodal Agent for Fine-Grained Smoke Event Analysis in Large-Scale Wild Environments. Pattern Recognition (PR, CAS Zone 1 TOP, first author), 2025
[10] Debiased Prototype Evolving for Point Cloud Domain Adaptation via 3D Foundation Models. International Conference on Acoustics, Speech, and Signal Processing (ICASSP, CCF B), 2025
[9] A Geometry-Material Aware Point Cloud Transformer for Large-scale Unstructured Thermal Analysis in 2.5D ICs. IEEE/ACM International Conference on Computer-Aided Design (ICCAD, CCF B), 2025
[8] Detecting Any Human-Object Interaction Relationship: Universal HOI Detector with Spatial Prompt Learning on Foundation Models. Advances in Neural Information Processing Systems (NeurIPS, CCF A, first author), 2024
[7] Universal Frequency Domain Perturbation for Single-Source Domain Generalization. ACM Multimedia (ACM MM, CCF A, co-first author), 2024
[6] Re-mine, Learn and Reason: Exploring the Cross-modal Semantic Correlations for Language-guided HOI detection. International Conference on Computer Vision (ICCV, CCF A, first author), 2023
[5] Coarse2Fine: Local Consistency Aware Re-prediction for Weakly Supervised Object Localization. AAAI Conference on Artificial Intelligence (AAAI, CCF A), 2023
[4] Attributes Grouping and Mining Hashing for Fine-Grained Image Retrieval. ACM Multimedia (ACM MM, CCF A), 2023
[3] Searching for Better Spatio-temporal Alignment in Few-Shot Action Recognition. Advances in Neural Information Processing Systems (NeurIPS, CCF A, co-first author), 2022
[2] EFFNet: Enhanced Feature Foreground Network for Video Smoke Source Prediction and Detection. IEEE Transactions on Circuits and Systems for Video Technology (IEEE TCSVT, CAS Zone 1 TOP, first author), 2022
[1] Combining the Convolution and Transformer for Classification of Smoke-Like Scenes in Remote Sensing Images. IEEE Transactions on Geoscience and Remote Sensing (IEEE TGRS, CAS Zone 1 TOP), 2022

【Granted Patents】
[6] Multispectral-based forest fire recognition method, system, program, and storage medium. Patent No. ZL202011122717.5 (first inventor)
[5] A smoke recognition method and apparatus, and electronic device. Patent No. ZL202110144582.0 (first inventor)
[4] Training method for a forest fire source estimation model, estimation method, and system. Patent No. ZL202110097330.7 (first inventor)
[3] Training method for a smoke and fire detection model, smoke and fire detection method, and device. Patent No. ZL202110215838.2 (first inventor)
[2] A black-smoke vehicle detection method based on a recurrent convolutional neural network. Patent No. ZL201811143567.9 (granted invention patent, second inventor)
[1] Multifunctional monitoring station with voice broadcast and video capture. Patent No. ZL201830024514.X (first inventor)

【Contact】
caoyichao@csu.edu.cn