Papers
Tag Filter
Multi-Stage Training Framework
UrbanLLaVA: A Multi-modal Large Language Model for Urban Intelligence with Spatial Reasoning and Understanding
Published: 6/29/2025
Urban Intelligence Multi-Modal Large Language Model, Urban Instruction Dataset, Spatial Reasoning Enhancement, Multi-Stage Training Framework, Urban Task Performance Evaluation
UrbanLLaVA is a multi-modal large language model designed for urban intelligence, processing four data types to enhance urban task performance. It leverages a diverse instruction dataset and a multi-stage training framework, achieving strong cross-city generalization.
SAM 3D: 3Dfy Anything in Images
Published: 11/21/2025
3D Object Reconstruction, Visually Grounded 3D Reconstruction, Single Image 3D Reconstruction, Human-Machine Collaborative Data Annotation, Multi-Stage Training Framework
SAM 3D is a generative model for reconstructing 3D objects from a single image, using a human-in-the-loop pipeline. It excels in complex scenarios and achieves notable performance gains, with a 5:1 win rate in real-world object preference tests.