About This Architecture
The PTv3 inlier-outlier point classification pipeline refines 3D point sets through concatenation, neural network embedding, and confidence-based masking. The inlier and outlier point tensors (N×D each) are concatenated into a single 2N×D tensor, which a PTv3 network processes to produce 2N×D_emb embeddings. The embeddings are then split back into their inlier and outlier halves, and each half is masked separately: the remove mask filters out low-confidence inliers, while the add mask promotes high-confidence outliers. Merging the two masked results yields the final point classification for downstream tasks. The design illustrates one way to handle mixed-confidence point cloud data in computer vision and 3D perception workflows. Fork this diagram on Diagrams.so to customize tensor dimensions or threshold values, or to integrate alternative backbone networks.
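The data flow described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the diagram author's implementation: `embed_stub` is a hypothetical stand-in for the PTv3 backbone (a fixed random projection, not a real network), `confidence` is an assumed scoring head, and both thresholds are arbitrary example values.

```python
import numpy as np


def embed_stub(points, d_emb=8, seed=0):
    """Hypothetical stand-in for the PTv3 backbone: (M, D) -> (M, D_emb).

    A fixed random linear projection followed by tanh; a real pipeline
    would call the trained network here instead.
    """
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal((points.shape[1], d_emb))
    return np.tanh(points @ weights)


def confidence(embeddings):
    """Assumed scoring head: sigmoid of the mean embedding value, in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-embeddings.mean(axis=1)))


def refine(inliers, outliers, remove_thresh=0.5, add_thresh=0.5, embed=embed_stub):
    """Concatenate, embed, split, mask, and merge the two point sets."""
    n = inliers.shape[0]

    # Concatenate the N x D inlier and outlier tensors into one 2N x D tensor.
    stacked = np.concatenate([inliers, outliers], axis=0)

    # Embed all points in a single pass: 2N x D -> 2N x D_emb.
    emb = embed(stacked)

    # Split the embeddings back into inlier and outlier halves.
    emb_in, emb_out = emb[:n], emb[n:]

    # Remove mask: True for inliers confident enough to keep.
    keep_mask = confidence(emb_in) >= remove_thresh
    # Add mask: True for outliers confident enough to promote.
    add_mask = confidence(emb_out) >= add_thresh

    # Merge: kept inliers plus promoted outliers form the refined inlier set;
    # everything else forms the refined outlier set.
    refined_in = np.concatenate([inliers[keep_mask], outliers[add_mask]], axis=0)
    refined_out = np.concatenate([inliers[~keep_mask], outliers[~add_mask]], axis=0)
    return refined_in, refined_out, keep_mask, add_mask
```

Calling `refine(inliers, outliers)` with two arrays of shape (N, D) returns the refined inlier and outlier sets along with both boolean masks. Passing a different callable as `embed` is one way to swap in an alternative backbone, and the two threshold arguments control how aggressively points are moved between the sets.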