person profile

Han

Han — researcher or builder tracked in the Angestrom contributor network.

52 Connections

8 Papers

0 Models

5 Repos

15 News

Repositories · 5

ultralytics/ultralytics

Ultralytics YOLO26, YOLO11, YOLOv8 — object detection, instance segmentation, semantic segmentation, image classification, pose estimation, object tracking

bpschauhan/krishiAI

KrishiAI is a production-grade monorepo foundation for an AI-powered agricultural operating system focused on farmers in Uttar Pradesh, India.

chandler.bing/ecoscan-26940117-deletion_scheduled-84023208

A project providing digital recognition of handwritten documents using machine learning techniques. https://escriptorium.eu/

luwhano/fastapi-langgraph-agent-production-ready-template

A production-ready FastAPI template for building AI agent applications with LangGraph integration. This template provides a robust foundation for building scalable, secure, and maintainable AI agent services.

News · 15

Palantir CEO says US government customers are shifting to open-source AI, and it matters for more than just defense stocks - Crypto Briefing


Video Friday: AI Gives Robot Hands Humanlike Dexterity

Video Friday is your weekly selection of awesome robotics videos, collected by your friends at IEEE Spectrum robotics. We also post a weekly calendar of upcoming robotics events for the next few months. […]

[Research] JetSpec: Speculative Decoding with Parallel Tree Drafting Enables up to 9.64x Lossless LLM Inference Speedup with more than 1000TPS


Been running Qwen3.6-27B through a 3-critic harness. The harness matters more than I thought

Been running Qwen3.6-27B (8-bit) through my coding harness for a few days, alongside GLM5.2. The harness uses 3 critics (code review, test review, Playwright e2e), each with fresh context before accepting output. Qwen3.6 is legit for a 27B dense model. Benchmarks weren't lying. It handles repo-level reasoning, produces decent code. But yeah it makes more mistakes than frontier models. Expected. What I didn't expect was that the 3-critic pipeline I

[audio.cpp] VibeVoice 1.5B released — 90-min podcast in 22.95 min, 4.08x real-time, 2.86x faster than Python without quantization. Native C++/ggml

I’m the author of audio.cpp, a C++/ggml runtime for local audio models. I just added VibeVoice 1.5B support and wanted to share the benchmark because long-form multi-speaker TTS is a good stress test for local inference runtimes. Result on RTX 5090 (VibeVoice 1.5B): audio length 5615.73s / 93.60 min; wall time 1376.84s / 22.95 min; RTF 0.245; speed 4.08x faster than real time. Python baseline: 92.66 min audio in 6
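The real-time factor and speed figures quoted in this post can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes the usual definition of RTF as wall-clock time divided by audio duration. The input numbers are the ones reported in the post.

```python
def real_time_factor(wall_seconds: float, audio_seconds: float) -> float:
    """RTF = processing time / audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return wall_seconds / audio_seconds


def speedup_over_real_time(wall_seconds: float, audio_seconds: float) -> float:
    """Inverse of RTF: seconds of audio produced per second of wall time."""
    if wall_seconds <= 0:
        raise ValueError("wall_seconds must be positive")
    return audio_seconds / wall_seconds


# Figures reported in the post (RTX 5090, VibeVoice 1.5B).
audio_s = 5615.73
wall_s = 1376.84

rtf = real_time_factor(wall_s, audio_s)
speed = speedup_over_real_time(wall_s, audio_s)

print(f"Audio length: {audio_s / 60:.2f} min")
print(f"Wall time:    {wall_s / 60:.2f} min")
print(f"RTF:          {rtf:.3f}")
print(f"Speed:        {speed:.2f}x real time")
```

With the reported inputs, the rounded values agree with the post: RTF 0.245 and 4.08x real time.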

Crypto exchange OKX wants AI agents to hire and pay each other

OKX is bringing together payments, identity, and reputation into a marketplace for AI agents.

Hotter Than a Hot Tub: The 45°C Breakthrough to Cool AI’s Biggest Machines

Hot tubs sit at about 38 to 40 degrees Celsius, warm enough that most people can only soak for about 15 minutes. NVIDIA’s newest AI servers can run their cooling liquid even hotter — up to 45 degrees Celsius, or 113 degrees Fahrenheit. That higher temperature limit is precisely what makes them more energy efficient. […]

Hankins Holdings LLC Invests in Artificial Intelligence to Meet the Next Generation of Consumer Expectations - EIN News


Papers · 8

Geometry-Aware Cross-Height Channel Knowledge Map Prediction for UAV-Assisted Communications With Uncertainty-Guided 3D Sensing

Low-altitude Unmanned Aerial Vehicles (UAVs) often need to infer channel knowledge across a range of heights from only sparse observations collected at a few altitude layers. To address this challenge, this paper studies height-conditioned cross-height channel knowledge map (CKM) prediction for UAV-assisted communications in geometry-rich urban environments. We develop a geometry-aware conditional prediction framework that combines urban scene priors, sparse multi-altitude observations, and target-height descriptors to reconstruct dense CKMs at unobserved target heights. An uncertainty head is

DexCompose: Reusing Dexterous Policies for Multi-Task Manipulation with a Single Hand

Dexterous manipulation policies can solve individual skills, but composing them to perform multiple tasks with a single hand remains challenging. Adding a new task on top of an existing manipulation skill often imposes conflicting demands on overlapping fingers and contact modes, causing destructive interference between preserving an existing manipulation outcome and executing a new one. We propose DexCompose, a role-aware residual composition framework that reuses pretrained dexterous policies for multi-task manipulation through explicit finger-level action ownership. Given two pretrained ful

Wireless Backdoor Attack and Defense for Semantic Communications over Multiple Access Channel

Semantic communication (SemCom) aims to preserve semantic meaning and task-oriented information beyond conventional message recovery over wireless channels. The adoption of SemCom in shared-access wireless networks introduces new vulnerabilities for multi-user semantic inference. This paper considers a SemCom system for two transmitters communicating with a common receiver over a multiple access channel. Each transmitter maps source information into latent semantic representations, while the receiver jointly reconstructs and classifies the semantic information for both transmitters. A selectiv

Towards in-the-wild Egocentric 3D Hand-Object Pose Estimation

Estimating accurate 3D hand-object pose from in-the-wild egocentric RGB remains challenging due to severe occlusions and ambiguous contact. Existing learning-based methods often struggle to generalise to in-the-wild scenes and are limited by the scarcity of supervision. We address these issues with two contributions. First, we introduce EPIC-Contact, an in-the-wild egocentric dataset of 2.3K clips (62.3K frames) with dense, bijective 3D hand-object contact correspondences and posed meshes. Second, we propose HOPformer, an end-to-end transformer that jointly predicts bi-manual hand and object p

Beyond Surface Forms: A Comprehensive, Mechanism-Oriented Taxonomy of Indirect Linguistic Encoding for LLM-Based Coded Language Detection

To avoid moderation and surveillance on social media, some users routinely invent indirect linguistic expressions (ILE) that camouflage sensitive meanings. Such expressions surface as algospeak, euphemisms, and adversarial obfuscation, depending on intent and context, and they involve recurring encoding mechanisms. We propose a comprehensive, mechanism-oriented taxonomy of ILE that abstracts away from communicative goals and instead categorizes the underlying operations through which meaning is encoded and recovered. We evaluate the taxonomy by incorporating it into LLM prompts and comparing i

Mechanism-Driven Monitors for Preemptive Detection of LLM Training Instability

Frontier large language model training consumes massive accelerator fleets and long wall-clock computation, making stability failures costly when they occur. After a numerical or a hyperparameter fault has already destabilized the training dynamics, it may continue for thousands of steps while loss and gradient norms still appear normal. We study mechanism-driven detection of training instability by deriving internal monitors from the functional role of each critical module and from the earliest computational sites where failures are expected to produce measurable signatures. For low-precision

PhysiFormer: Learning to Simulate Mechanics in World Space

We present PhysiFormer, a diffusion transformer for physically-plausible 3D object motion. Unlike video world models that operate in view-dependent pixel space, PhysiFormer represents objects as 3D meshes expressed in world coordinates. Given the initial vertex positions and velocities, as well as object material type, rigid or elastic, the model samples future vertex trajectories. While related neural physics approaches build on ad-hoc latent spaces or explicitly enforce rigidity and causality, PhysiFormer shows that excellent results can be obtained without any such inductive biases, by cast

MixTTA: Low-Rank Cross-Channel Mixing for Reliable Test-Time Adaptation

Test-Time Adaptation (TTA) methods commonly update the affine parameters of normalization layers to adapt deployed models under distribution shifts. However, per-channel affine parameters perform axis-aligned scaling and shifting, making them geometrically incapable of correcting cross-channel structural changes induced by distribution shift. To address this limitation, we propose MixTTA, a lightweight plug-in module that equips normalization layers with a low-rank cross-channel transformation, enabling inter-channel mixing at each layer. To ensure that the low-rank branch captures only cross-
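The geometric point in this abstract, that a per-channel affine map cannot move information between channels while a low-rank term can, is easy to see in a toy example. The sketch below is not the paper's implementation: the residual form `gamma * x + beta + U (V^T x)`, the function names, and the hand-picked rank-1 factors are all assumptions made for illustration.

```python
def affine(x, gamma, beta):
    """Per-channel scale and shift: channel c depends only on x[c]."""
    return [g * xi + b for g, xi, b in zip(gamma, x, beta)]


def low_rank_mix(x, U, V):
    """Compute U (V^T x), where U and V are C x r matrices stored as row lists."""
    channels, rank = len(x), len(U[0])
    # Project the C channels down to r coefficients.
    z = [sum(V[c][k] * x[c] for c in range(channels)) for k in range(rank)]
    # Expand the r coefficients back to C channels.
    return [sum(U[c][k] * z[k] for k in range(rank)) for c in range(channels)]


def affine_plus_mix(x, gamma, beta, U, V):
    """Assumed residual form: affine output plus the low-rank cross-channel term."""
    base = affine(x, gamma, beta)
    mix = low_rank_mix(x, U, V)
    return [b + m for b, m in zip(base, mix)]


x = [1.0, 2.0, 3.0]
gamma = [1.0, 1.0, 1.0]
beta = [0.0, 0.0, 0.0]

# Rank-1 factors chosen by hand: add 0.5 * channel 1 into channel 0.
U = [[1.0], [0.0], [0.0]]
V = [[0.0], [0.5], [0.0]]

y_affine = affine(x, gamma, beta)
y_mixed = affine_plus_mix(x, gamma, beta, U, V)
print(y_affine)
print(y_mixed)
```

Here the affine-only output leaves channel 0 unchanged, while the rank-1 term adds half of channel 1 to it. No choice of `gamma` and `beta` alone can produce that dependence on another channel.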
