Xiaolong Yang (杨小龙)

Data Platform, Tencent
Key Laboratory of Mathematics-Mechanization (KLMM), Academy of Mathematics and Systems Science, Chinese Academy of Sciences (AMSS, CAS)
University of the Chinese Academy of Sciences (UCAS)

Email: xiaolonyang@tencent.com [or] yangxiaolong17@mails.ucas.ac.cn

Biography

I am currently a researcher at Data Platform, Tencent. I received my PhD degree from the Academy of Mathematics and Systems Science of the Chinese Academy of Sciences (AMSS, CAS) and the University of Chinese Academy of Sciences (UCAS). My supervisors are Prof. Xiaohong Jia (AMSS, CAS) and Prof. Dong-Ming Yan from the National Laboratory of Pattern Recognition (NLPR), Institute of Automation of CAS (CASIA). Before that, I received my Bachelor's degree in Information and Computing Science from Northwestern Polytechnical University (NWPU) in 2017.

My current research focus is AIGC, including large models and multi-modality. My research interests also include computer graphics and computer vision. My resume can be found [ here ].

Publications

LARNeXt: End-to-End Lie Algebra Residual Network for Face Recognition
Xiaolong Yang, Xiaohong Jia, Dihong Gong, Dong-Ming Yan, Zhifeng Li, Wei Liu
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI), Vol.45, No.10, pp.11961-11976, 2023.
[Project Page] [PDF] [Code] [Supplement] [Slides]
LARNet: Lie Algebra Residual Network for Face Recognition
Xiaolong Yang, Xiaohong Jia, Dihong Gong, Dong-Ming Yan, Zhifeng Li, Wei Liu
In Proceedings of the 38th International Conference on Machine Learning (ICML 2021)
[Project Page] [PDF] [Code] [Supplement] [Slides]
6D Object Pose Estimation in Cluttered Scenes from RGB Images
Xiaolong Yang, Xiaohong Jia, Yuan Liang, Lubin Fan
Journal of Computer Science and Technology (JCST), Vol.37, No.3, pp.719-730, 2022.
[PDF] [Code] [Slides]
Simple Primitive Recognition via Hierarchical Face Clustering
Xiaolong Yang, Xiaohong Jia
Computational Visual Media (CVM), Vol.6, No.4, pp.431-443, 2020.
[PDF] [Code] [Patent]
6D Pose Estimation with Two-stream Net
Xiaolong Yang, Xiaohong Jia
ACM SIGGRAPH Posters, No.40, pp.1-2, 2020.
[PDF] [Code] [Slides&Talk]
Real-Time Facial Pose Estimation and Tracking by Coarse-to-Fine Iterative Optimization
Xiaolong Yang, Xiaohong Jia, Mengke Yuan, Dong-Ming Yan
Tsinghua Science and Technology (TST), Vol.25, No.5, pp.690-700, 2020.
[PDF] [Code]
Physical Model Analysis and Body Shape Modification of Platform Diving (跳台跳水的物理模型分析和体型修正)
Xiaolong Yang, RongPing Shen, Ziyin Zhang
First Prize of the 15th China Post-Graduate Mathematical Contest in Modeling, recommended to Journal of Mathematics in Practice and Theory, Vol.49, No.16, pp.35-45, 2019.
[PDF] [Prize]

Works

Dec.2023 - Now  
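Text-to-Video: Bilingual Video-Text CLIP
  • Implemented extended training based on Chinese and English vocabularies; Chinese image-text retrieval surpasses the current strongest Chinese model, while English image-text retrieval is slightly below OpenCLIP;
  • Converted the image-text interface to handle video-text inputs and trained a video-text CLIP with emphasis on semantic alignment for Chinese-specific local content, achieving high-performance Chinese video-text retrieval.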
Aug.2023 - Dec.2023  
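Business LLM (Finance): Full Pipeline of Pre-training and SFT
  • Extended the English-based llama2-70B model with Chinese and financial capabilities, exploring the best combination of vocabulary expansion and training scheme. Few-shot evaluation by the partner team: versus the original llama2 model, English +0, Chinese +7.5, finance +8.1; versus GPT3.5, English -1.7, Chinese +5.6, finance +11;
  • Instruction fine-tuning for finance, yielding interactive, customized functions for To B business such as financial report summarization and analysis of abnormal market movements;
  • The strong pre-trained model serves as the base model for several businesses, including the insurance LLM and general SFT LLM research.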
Apr.2023 - Aug.2023  
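General LLM (Hunyuan Assistant): Reinforcement Learning with RM, PPO, and DPO
  • Comprehensively improved the weak points of the original SFT model (warm and cold start) across basic NLP tasks, logical reasoning, multi-turn dialogue, and domain applications; led in multiple release evaluations;
  • Proposed a guided reasoning scheme that adjusts the interaction strategy between the main model and the reward model in ppo-ptx, efficiently resolving key bad cases in online user experience; on math and physics calculation problems, accuracy over multiple attempts improved about threefold and first-attempt accuracy rose from 0% to 87%, significantly improving the generalized reasoning ability of the reinforced model;
  • Trained and iterated a 176b RM. The final reward model reaches a validation accuracy (ability to learn from same-source data) of 76.96%, compared with the competing llama2 RM at 64.3% (Safety) and 70.6% (Helpfulness); its test accuracy (generalization to different-source data) is 70.41%, compared with 44.0% for automatic scoring by GPT4 with constructed prompts.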
Feb.2023 - Apr.2023  
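Business LLM (Advertising): SFT and Reinforcement Learning for Marketing Scripts
  • Built AIGC closed-domain Q&A bots based on information from key industries such as medical aesthetics, dentistry, ophthalmology, and hair transplantation; a positive case has already been established in the dental segment of consumer healthcare, where, for a top client, AI-generated Q&A compared with manually configured Q&A raised the lead validity rate by 44% (37.93%→54.55%) and reduced cost by 21% (1110.0→918.4);
  • Based on client preferences and conversion needs, used reinforcement learning to personalize the Q&A scripts, combining customized core capabilities in language style, communication skills, product knowledge, and marketing orientation: highly human-like with a warm and friendly tone; understands the industry knowledge graph of product attributes as well as needs that users express in non-professional terms; fresh and varied wording that promotes retention.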
Jan.2023 - Feb.2023  
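Text-to-Image: Reproducing SOTA and Modifying It to Exceed Its Performance
  • Reproduced the closed-source work DiT according to business needs; changed training from latent space to the pixel level, using a transformer with more parameters in place of the U-Net;
  • Results on the ImageNet 256x256 dataset surpass the current SOTA (FID: 2.22 vs. 2.27; lower is better).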
Oct.2022 - Feb.2023  
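Face Recognition: Solutions for Hard Cases in Industrial Applications
  • The face retrieval pipeline substantially outperforms competitors in both face detection and face recognition; it was the only high-accuracy solution delivered to the partner that exceeded 0.9;
  • Integrated real-person and cartoon scenarios into a single service, with no need to use an extra model for cartoons or to switch the underlying service to adjust parameters;
  • Precision and recall optimization for key persons: recall on large-scale 'politically sensitive' persons is 0.9836, far above the competitor's 0.0717; second-level label accuracy is 0.87, well above the business requirement of 0.8;
  • Targeted optimization for heavily blurred scenes, mixed scenes with multiple people and faces from both real persons and photos, old black-and-white photos with sharpening distortion, occlusions such as masks, and exaggerated cartoon scenes.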
Aug.2022 - Oct.2022  
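Privacy-Preserving Computation: High-Accuracy Probabilistic Regression Model for Encrypted Big Data
  • Developed a high-accuracy model combining joint probability density with the posterior probability of label errors, achieving very high retrieval and decision accuracy on privacy-encrypted communication data. For 1K/5K/10K unknown encrypted samples, average model accuracy improved from 98+ to 99.9/99.7/99.6;
  • Reduced the number of PSI sampled features from 27 to 7, with a single run taking about 0.0003 seconds while maintaining high-accuracy encrypted communication;
  • Participated in iDASH 2022 Task 4 and ranked first in model accuracy on that track with 99.39, ahead of all competitors including Ant Group (last year's champion in two tracks, 99.04); placed second overall after privacy communication efficiency was factored in. [Prize]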
Jun.2022 - Aug.2022  
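Advertising Feature Engineering: Video-Text Relevance and Fine-Ranking Feature Experiments
  • Developed a general TVR algorithm; experiments show more than 15% improvement over most state-of-the-art algorithms, on par with the current large-model SOTA method HunYuan_TVR;
  • TVR algorithm for advertising scenarios: R1 results indicate that automated review efficiency could improve by an estimated 10%; classification accuracy on the all-industry test set is 87.3%;
  • Linked ad feature data with user feature data and launched the fine-ranking pipeline for ad features; launched relevance-based pushing using dual embedding features of official account articles and ads.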

Honorary Awards

2018  First Prize of the 15th China Post-Graduate Mathematical Contest in Modeling [ Top 1% ↑ ]
2019  Pacemaker to Merit Student of CAS (中科院三好学生标兵) [ Top 1% ↑ ]
2020-2022 (3 years) Excellent Merit Student of CAS (中科院优秀三好学生) [2020, 2021, 2022]
2021  National Scholarship (国家奖学金) [ Top 1% ↑ ]
2022  CAS Presidential Scholarship (中科院院长奖) [ Top 1% ↑ ]

Research Experiences

Sep.2020 - Sep.2021   Visiting student (Rhino-Bird Program), Tencent Technologies Co., Ltd. (Shenzhen), directed by Dr. Zhifeng Li.
Nov.2019 - Mar.2020   Intern student (Innovative Research Program), Alibaba Group (Beijing).
Mar.2017 - Sep.2021   Visiting student, Department of Automation (Research) and Department of Mathematics (CAD Project), Tsinghua University (Beijing).

Academic Services

Conference Reviewer:   NeurIPS, ICML, ICLR, CVPR, ICCV, ECCV, AAAI, CVM ...
Journal Reviewer:   T-PAMI, T-CSVT, T-NNLS, Neurocomputing, The Visual Computer ...