My name is Yihao, and I am a Ph.D. student in Computer Science at Johns Hopkins University. My research lies at the intersection of medical robotics, mixed reality, and artificial intelligence. I focus on language-conditioned manipulation, particularly modeling it for reasoning-intensive, long-horizon tasks. To support this work, I also develop medical robotic infrastructure that enables more capable and intelligent systems.
My Ph.D. advisor is Prof. Mehran Armand, who holds appointments in the Departments of Computer Science, Mechanical Engineering, and Orthopaedic Surgery, and at the Applied Physics Laboratory. I was advised by Prof. Peter Kazanzides during my MSE, and by Prof. Zheng Liu and Prof. Jian Liu during my undergraduate studies. Over the last few years, I have had the pleasure of interning at Moon Surgical and Rokae Robotics.
At the center of my recent studies is the State Machine Serialization Language (SMSL). I am investigating how best to formalize the mapping from complex, interleaved workflows to a structured subtask space and the transitions between subtasks. The central challenge is that the required level of structure depends on the semantics of a scene, so task hierarchy and open-endedness are my focus. Such structure should be capturable by an agent and passed to downstream VLA models for execution.
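To make the idea concrete, here is a minimal sketch of what a serialized subtask state machine could look like. The schema, state names, and predicates below are my own illustrative assumptions for this page, not the actual SMSL specification.

```python
import json
from dataclasses import dataclass, field, asdict

# Hypothetical sketch: a long-horizon task expressed as subtasks with
# scene-dependent preconditions and outcome-driven transitions, serialized
# to JSON so a downstream VLA policy can consume it. Names are illustrative.

@dataclass
class Subtask:
    name: str                # e.g. "grasp_needle"
    precondition: str        # scene predicate that must hold on entry
    transitions: dict = field(default_factory=dict)  # outcome -> next subtask

def serialize(subtasks):
    """Serialize the subtask graph to JSON for downstream execution."""
    return json.dumps([asdict(s) for s in subtasks], indent=2)

# A toy surgical pick-handover-drive task with a retry loop on failure.
task = [
    Subtask("grasp_needle", "needle_visible",
            {"success": "handover", "failure": "grasp_needle"}),
    Subtask("handover", "needle_grasped",
            {"success": "drive_needle", "failure": "grasp_needle"}),
    Subtask("drive_needle", "needle_in_right_gripper",
            {"success": "done"}),
]

print(serialize(task))
```

The point of the serialization is that the agent reasons over the explicit graph (hierarchy, retries, termination) while the low-level policy only ever sees one well-scoped subtask at a time.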
Selected Publications
* Corresponding author
Look Before You Leap: Using Serialized State Machine for Language Conditioned Robotic Manipulation
This paper proposes a framework that uses serialized Finite State Machines (FSMs) to generate demonstrations for imitation learning in robotic manipulation. The method addresses the issue of cascading errors caused by incomplete demonstration coverage, especially in long-horizon, precision-heavy tasks.
dARt Vinci: Egocentric Data Collection for Surgical Robot Learning at Scale
Y Liu*, YC Ku, J Zhang, H Ding, P Kazanzides, M Armand
The paper presents dARt Vinci, a scalable data collection platform for robot learning in surgical settings that leverages Augmented Reality (AR) and a high-fidelity physics engine to capture nuanced surgical maneuvers without requiring a physical robot setup.
An Image-Guided Robotic System for Transcranial Magnetic Stimulation
Y Liu*, J Zhang, L Ai, J Tian, S Sefati, H Liu, A Martin-Gomez, A Kheradmand, M Armand
This paper presents an image-guided robotic system for transcranial magnetic stimulation (TMS) that aims to standardize coil pose planning based on detailed brain anatomy and improve placement accuracy. The system addresses limitations of manual TMS by introducing a reference pose and enhancing repeatability through robotic actuation.
EVD Surgical Guidance with Retro-Reflective Tool Tracking and Spatial Reconstruction Using Head-Mounted Augmented Reality Device
H Li, W Yan, D Liu, L Qian, Y Yang, Y Liu, Z Zhao, H Ding, G Wang
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2024
This paper introduces an AR-based guidance framework for External Ventricular Drain (EVD) surgery that leverages Time-of-Flight (ToF) depth sensors for accurate, non-invasive tracking and surface reconstruction. We built a correction method that reduces errors by over 85%.
Calibration of Augmented Reality Headset with External Tracking System Using AX=YB
L Ai, Y Liu*, M Armand, A Martin-Gomez
IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2024
This paper proposes a robust virtual-to-real calibration method for Augmented Reality by formulating the problem as a hand-eye/robot-world calibration using the AX=YB model. The approach leverages both head-mounted display self-localization and external tracking, with additional techniques for data synchronization and outlier filtering based on the Frobenius norm.
Autokinesis Reveals a Threshold for Perception of Visual Motion
Y Liu, J Tian, A Martin-Gomez, Q Arshad, M Armand, A Kheradmand
This study investigates how autokinesis—illusory motion perception in dark environments—affects visual motion perception in humans. Using a novel optical tracking method, we found that at lower motion speeds in darkness, perceived motion aligned more with autokinesis, while brighter environments or higher speeds improved accuracy.
Segment Any Medical Model Extended
Y Liu*, J Zhang, A Diaz-Pinto, H Li, A Martin-Gomez, A Kheradmand, M Armand
This paper presents SAMM Extended (SAMME), a unified platform designed to enhance and evaluate the Segment Anything Model (SAM) for medical image segmentation. SAMME supports variant integration, fine-tuning, new interaction modes, and improved communication protocols.
A Roadmap Towards Automated and Regulated Robotic Systems
This paper proposes a roadmap for integrating generative AI into regulated and autonomous robotic systems, particularly for safety-critical domains like medicine. We argue that while generative models are suitable for low-level tasks, their outputs should undergo regulatory oversight before robotic execution. The State Machine Serialization Language (SMSL) is introduced in this work.
On the Fly Robotic-Assisted Medical Instrument Planning and Execution Using Mixed Reality
L Ai, Y Liu*, M Armand, A Kheradmand, A Martin-Gomez
IEEE International Conference on Robotics and Automation (ICRA), 2024
This paper presents a mixed reality framework designed to reduce the complexity of operating Robotic-Assisted Medical Systems (RAMS) by enhancing human-robot interaction. The system enables real-time planning and execution through 3D anatomical overlays, collision detection, and an intuitive robot programming interface, supported by a user-friendly head-mounted display calibration.
GBEC: Geometry-Based Hand-Eye Calibration
Y Liu*, J Zhang, Z She, A Kheradmand, M Armand
IEEE International Conference on Robotics and Automation (ICRA), 2024
This paper introduces Geometry-Based End-Effector Calibration (GBEC), a novel method for hand-eye calibration that relies solely on the physical geometry between the robot end-effector and the attached sensor, improving accuracy and repeatability over traditional pose-based methods like AX=XB.
Real-Time Robust Shape Estimation of Deformable Linear Objects
J Zhang, Z Zhang, Y Liu*, Y Chen, A Kheradmand, M Armand
IEEE International Conference on Robotics and Automation (ICRA), 2024
This paper presents a robust, real-time shape estimation method for linear deformable objects using scattered, unordered key points without the need for dense point clouds or fully visible markers. A probability-based labeling algorithm determines the correct order of key points, followed by piecewise spline interpolation to reconstruct the object's shape.
SAMM (Segment Any Medical Model): A 3D Slicer Integration to SAM
This extension is the first working medical segmentation tool using the foundation model Segment Anything (SAM). SAMM enables near real-time image mask inference with a latency of just 0.6 seconds, facilitating the development, evaluation, and application of SAM on medical images.