Topic

Cv

50 items across the graph, tagged with Cv.

From the graph · 50

repo
modelscope/modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

repo
shimat/opencvsharp

OpenCV wrapper for .NET

repo
scverse/scanpy

Single-cell analysis in Python. Scales to >100M cells.

repo
emgucv/emgucv

Emgu CV is a cross-platform .NET wrapper for the OpenCV image processing library.

repo
TimefoldAI/timefold-solver

The open source Solver AI for Java and Kotlin to optimize scheduling and routing. Solve the vehicle routing problem, employee rostering, task assignment, mainte…

repo
keras-team/keras-hub

Pretrained model hub for Keras 3.

repo
mosecorg/mosec

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine.

repo
scverse/anndata

Annotated data.

repo
ROCm/MIVisionX

AMD MIVisionX is a computer vision toolkit built around a highly optimized, conformant open-source implementation of the Khronos OpenVX™ 1.3 specification. As o…

repo
capsulerun/vpod

Lightweight, secure Linux sandboxes for untrusted processes.

repo
samuelhm/42Jobs

AI-powered job search platform for junior software engineers: job fetching, smart filtering, keyword extraction, and ATS-optimized CV generation. Built with .NE…

repo
Ruichen0424/ai-paper-explorer

AIPaperX: A one-stop search engine for top-tier AI conference & journal papers

repo
uvaishmohd307-sketch/GestureX

AI-powered virtual mouse using hand gestures built with Python, OpenCV and MediaPipe.

paper
DisciplineGen-1M: A Large-Scale Dataset for Multidisciplinary Visual Generation and Editing

Recent image generation and editing models can produce visually appealing natural images, yet they remain unreliable when the target image is a knowledge-intens…

paper
HiRes: A Hierarchical Cascaded Method for Resistor Value Identification

Accurate identification of resistor values from unconstrained images remains a challenging computer vision task due to variations in lighting, orientation, scal…

paper
ASTAD: Asymmetric Style Transfer for Synthetic-to-Real Adaptation in Autonomous Driving

Synthetic data mitigates the data scarcity problem in autonomous driving perception. However, the synthetic-to-real gap leads to performance degradation, hinder…

paper
Your Data Manifold is Secretly a Reward Model: Shell-LCC for Text-to-Video Generation

Recent text-to-video (T2V) diffusion models rely heavily on auxiliary reward signals (e.g., via reward models or DPO) to align generated content with human aest…

paper
Efficient PEFT Methods with Adaptive Checkpointing for Vision Models and VLMs on Resource Constrained Consumer-GPUs

Modern pretrained vision models achieve strong accuracy but demand substantial GPU memory for fine-tuning, making edge deployment impractical. This paper compar…

paper
High-dimensional Embedding Prior for Noisy K-space Domain MRI Reconstruction

Magnetic resonance imaging (MRI) reconstruction under realistic acquisition conditions can be fundamentally viewed as estimating the underlying k-space distribu…

paper
AnyBokeh: Physics-Guided Any-to-Any Bokeh Editing with Optical Fingerprint Transfer

Depth-of-field control is a fundamental tool in photography, yet post-capture bokeh editing from a single image remains challenging. A practical editor should h…

paper
FR-DETR: Frequency and Recurrent Feature Refinement for Robust Object Detection under Adverse Weather

Object detection under adverse weather remains challenging due to severe visual degradations and domain shifts. Existing enhancer-based approaches attempt to im…

paper
Perceive-to-Reason: Decoupling Perception and Reasoning for Fine-Grained Visual Reasoning

Fine-grained visual reasoning remains challenging for vision-language models, especially when small but critical visual cues are buried in high-resolution image…

paper
PS-MOT: Cultivating Instance Awareness from Point Seeds for Multi-Object Tracking

We introduce Point-supervised Multi-Object Tracking (PS-MOT) as a cost-effective alternative to traditional bounding box supervision, shifting the focus from sp…

paper
Hyper-Network Neural Functional Maps for Unsupervised Robust 3D Shape Matching

Functional maps are the cornerstone of recent non-rigid 3D shape matching methods due to their efficiency and performance. However, existing methods struggle wi…

paper
CPDDNet: Color-Polarization Denoising and Demosaicking Network

Color-polarization imaging using a color-polarization filter array (CPFA) sensor captures both texture (color intensity) and physical (polarization) information…

paper
Real-Time Visual Intelligence on Low-Cost UAVs: A Modular Approach for Tracking, Scanning, and Navigation

Autonomous drones are rapidly transforming modern warfare and civil applications alike. This paper presents the development of an integrated intelligent drone s…

paper
RESOLVE: A Multi-Resolution and Multi-Modal Dataset for Roadside Cooperative Perception

LiDAR has increasingly been integrated into traffic cameras to expand coverage and mitigate occlusion in roadside cooperative perception. However, how unimodal…

paper
Automated Background Swapping for Robustness against Spurious Backgrounds

Classifiers based on Deep Neural Networks exhibit strong performance across domains, yet can fail catastrophically if they rely on spurious correlations, i.e.,…

paper
Seek to Segment: Active Perception for Panoramic Referring Segmentation

Existing referring segmentation models passively process static images captured from fixed perspectives, limiting their applicability in Embodied AI, where agen…

paper
3D Scene-Adaptive Trajectory-Controllable Human Image Animation with Camera Movement

Human image animation, which aims to generate a video of a reference subject following a provided action sequence, has received increasing research interest. Wi…

paper
W4A4 Quantization for Inference on Wan2.2-I2V-A14B

We summarize our submission to Sub-Challenge 1: W4A4 Quantization for Inference (HiF4 / MXFP4) of the ICME 2026 Low-Bit-width Large-Model Quantization Challenge…

paper
PerceptionRubrics: Calibrating Multimodal Evaluation to Human Perception

We introduce PerceptionRubrics, a rubric-based evaluation framework that addresses the gap between saturated benchmark scores and real-world brittleness. Shifti…

paper
FLORA: A deep learning approach to predict forest attributes from heterogeneous LiDAR data

Forest attributes are essential for national-scale resource monitoring. Airborne LiDAR metrics are among the auxiliary variables most strongly correlated with f…

paper
Towards Metric-Agnostic Trajectory Forecasting

Accurate trajectory forecasting of surrounding traffic participants is a core capability for autonomous driving, enabling vehicles to anticipate behavior and pl…

paper
QuaMoE-DRF: Proactive Beam and Rate Adaptation via Multimodal Dynamic Radio Map Forecasting in ISAC Networks

Static radio maps provide location-dependent propagation priors, but they cannot capture short-term blockage caused by moving objects. Direct sensing-assisted b…

paper
GaussianEmoTalker: Real-Time Emotional Talking Head Synthesis with Audio-Driven and Blendshape-Based 3D Gaussian Splatting

Audio-driven talking head synthesis has achieved impressive progress in lip synchronization and visual quality, yet generating expressive emotional avatars with…

paper
SuperFlex: Deformable Superquadrics for Point Cloud Decomposition

Superquadrics have proven to provide a compact, geometrically meaningful representation for 3D objects. However, existing methods suffer from limited reconstruc…

paper
D$^{2}$R$^{2}$OSR: Degradation-Disentangled Representation for Real-World Omnidirectional Image Super-Resolution

With the growing demand for immersive visual experiences, high-quality omnidirectional images (ODIs) have become increasingly important. However, limitations in…

paper
RayPE: Ray-Space Positional Encoding for 3D-Aware Video Generation

Modern video diffusion transformers position their tokens through RoPE on the (u,v,t) axes -- a description of the camera's sampling grid that says nothing abou…

paper
Learning to Evolve Scenes: Reasoning about Human Activities with Scene Graphs

Understanding human behavior while interacting with the surrounding world is crucial for many applications of embodied AI. First-person videos are particularly…

paper
Exact and Deterministic Patch Descriptor Retrieval via Hierarchical Normalization

We present a patch descriptor retrieval method that returns the exact nearest neighbour -- provably identical to exhaustive full-vector search -- while evaluati…

paper
Learning from Reliable Latent Prompts for Visual Recognition with Missing Modalities

Large-scale multimodal models (LMMs) have achieved superior performance in visual recognition by synergizing information across diverse, massive-scale paired mo…

paper
GMO-E$^2$DIT: Grounded Multi-Operation Editing for E-Commerce Images

Real-world e-commerce image editing often requires multiple, localized, and auditable operations rather than global restyling. This compositional nature poses a…

paper
MVP-Nav: Multi-layer Value Map Planner Navigator

Zero-shot Object Goal Navigation (ZSON) with RGB-only perception poses a fundamental challenge for embodied agents, as the absence of explicit depth information…

paper
MirrorPPR: Exemplar-Based Portrait Photo Retouching

While text-guided image editing has made remarkable progress, it remains limited in structural portrait retouching. Textual descriptions struggle to convey fine…

paper
Sculpting NeRF Geometry: Human-Preference Fine-Tuning of a 3D-Aware Face GAN

Reinforcement learning from human feedback (RLHF) for 3D generation is now established across a number of works, but most existing pipelines optimise explicit s…

paper
RSICCLLM: A Multimodal Large Language Model for Remote Sensing Image Change Captioning

Remote Sensing Image Change Captioning (RSICC) aims to describe changes between bi-temporal remote sensing images and holds significant research and application…

paper
SatSplatDiff: Geometry-preserving generative refinement for high-fidelity satellite Gaussian Splatting

Gaussian Splatting has been recently explored for satellite 3D reconstruction, demonstrating flexibility and efficiency in representing radiometrically diverse…

paper
No Place to Hide: Benchmarking Video Hallucination with Background-Controlled Pairs

We introduce VidPair-Halluc, a new benchmark for evaluating video hallucination in large video models (LVMs) under rigorous and controlled conditions. Unlike pr…

paper
Ink3D: Sculpting 3D Assets with Extremely Complex Textures via Video Generative Models

Recent 3D generative models can synthesize high-quality geometry but often struggle to reproduce intricate textures from reference images, largely due to the sc…
