The Computer Vision Defect Detection for Manufacturing project delivers an AI-powered system for real-time identification of surface defects such as cracks, dents, and scratches on manufactured products. It ingests images from manufacturing lines, prepares datasets with auto-annotation and augmentation, trains a YOLOv8 model with TensorFlow, converts the model to ONNX for optimized inference, and integrates with an edge-device dashboard for visualization and alerts. The solution achieves 96% accuracy and sub-50ms inference, reduces manual labeling effort by 80%, and strengthens quality control; it was completed over 10 weeks (September to November 2025) as a proof of concept for client adoption.
The architecture follows an end-to-end pipeline: data is collected and processed via ETL into a warehouse, augmented and annotated for training, modeled with YOLOv8 and TensorFlow for defect detection, exported to ONNX for optimized edge inference, and visualized through a dashboard on devices such as the NVIDIA Jetson. This design provides real-time performance under varying conditions, data traceability through warehousing, and scalability toward production, with a focus on defect types, bounding boxes, and alerts that improve manufacturing efficiency.
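The stages above can be sketched as a minimal ingest-transform-load skeleton. This is a hypothetical illustration only: the function names (`ingest`, `transform`, `load_warehouse`) and stub record format are assumptions, not the project's actual module layout.

```python
# Hypothetical pipeline skeleton; stage names and record shapes are
# illustrative assumptions, not the project's real code.

def ingest(source):
    """Collect raw frames from a manufacturing-line source (stub)."""
    return [{"id": i, "pixels": None} for i in range(source["frames"])]

def transform(images):
    """Auto-annotate and augment each image record (stub)."""
    return [dict(img, annotated=True, augmented=True) for img in images]

def load_warehouse(records):
    """Load processed records into the warehouse (stub: returns row count)."""
    return len(records)

def run_pipeline(source):
    """Chain the three stages end to end."""
    return load_warehouse(transform(ingest(source)))

rows = run_pipeline({"frames": 4})
print(rows)  # 4 records flow through all stages
```

Keeping each stage a plain callable makes it straightforward to swap the stubs for OpenCV capture, Albumentations transforms, and warehouse loaders without changing the orchestration.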
The system uses OpenCV for image capture and processing, YOLOv8 for object detection, TensorFlow for training and fine-tuning, ONNX Runtime for optimized inference, and PostgreSQL/SQLite for data warehousing. Additional libraries include NumPy and Pandas for data handling and Matplotlib for visualization, alongside Python ETL scripts; hardware comprises GPU servers for training and edge devices (e.g., Jetson Nano, Raspberry Pi) for deployment.
The defect model fine-tunes YOLOv8 with TensorFlow to detect cracks, dents, and scratches. It is trained on augmented datasets (10,000+ images) with an 80/10/10 train/validation/test split, IoU and classification losses, and hyperparameters including a 0.001 learning rate and 50 epochs. Features include bounding boxes from auto-annotation (a pre-trained YOLO model plus OpenCV edge detection), augmentations (rotation, brightness, noise), and custom layers for small defects. Evaluation yields 96% mAP, with ONNX quantization reducing model size to 8 MB and inference time to 40 ms/frame.
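The IoU metric underlying both the loss and the mAP evaluation is worth making concrete. A minimal, framework-free sketch for two axis-aligned boxes in (x1, y1, x2, y2) form:

```python
def iou(box_a, box_b):
    """Intersection-over-Union for two (x1, y1, x2, y2) boxes."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp to zero so disjoint boxes get no overlap area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Two 10x10 boxes overlapping in a 5x5 patch: IoU = 25 / 175.
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))
```

Detection frameworks vectorize this over whole batches, but the per-pair arithmetic is exactly this: intersection area divided by the union of the two box areas.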
Data processing extracts images from sources (e.g., client samples, public datasets) using ETL scripts; transforms them with auto-annotation (YOLOv8 bounding boxes, OpenCV thresholding), augmentation (via Albumentations), and cleaning (duplicate removal); and loads them into a PostgreSQL warehouse with schemas for images, annotations, and metadata. The pipeline is orchestrated via Python cron jobs and Airflow simulations, ensuring data quality, supporting queries (e.g., defect trends), and scaling to large datasets, expanding the corpus from 2,000 raw to 10,000 processed images.
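A warehouse layout of this shape can be sketched with Python's built-in `sqlite3` (the project's lighter-weight option). The table and column names here are illustrative assumptions; the real PostgreSQL schema may differ.

```python
import sqlite3

# Hypothetical warehouse schema mirroring the images/annotations/metadata
# tables described above; column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE images (
    image_id    INTEGER PRIMARY KEY,
    source      TEXT NOT NULL,   -- e.g. 'client_sample', 'public_dataset'
    captured_at TEXT
);
CREATE TABLE annotations (
    annotation_id INTEGER PRIMARY KEY,
    image_id      INTEGER REFERENCES images(image_id),
    defect_type   TEXT CHECK (defect_type IN ('crack', 'dent', 'scratch')),
    x1 REAL, y1 REAL, x2 REAL, y2 REAL   -- bounding-box corners
);
CREATE TABLE metadata (
    image_id INTEGER REFERENCES images(image_id),
    key TEXT, value TEXT
);
""")

conn.execute("INSERT INTO images VALUES (1, 'client_sample', '2025-10-01')")
conn.execute("INSERT INTO annotations VALUES (1, 1, 'crack', 10, 10, 40, 40)")

# A 'defect trends'-style query: counts grouped by defect type.
rows = conn.execute(
    "SELECT defect_type, COUNT(*) FROM annotations GROUP BY defect_type"
).fetchall()
print(rows)  # [('crack', 1)]
```

Separating annotations from images keeps one-to-many labeling cheap to query, and the same DDL ports to PostgreSQL with minor type changes.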
Testing includes unit validation of annotation and model functions, integration checks of the ETL-to-dashboard flow, performance tuning to reach 96% accuracy and 40 ms inference, and simulation tests for lighting variations. Deployment exports the ONNX model to edge devices (Jetson/Raspberry Pi), integrates OpenCV camera feeds with the dashboard (Streamlit/Flask), uses a blue-green strategy for cutover, and supports rollback via model versioning, with warehouse queries used for validation.
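Verifying the 40 ms/frame target calls for a small benchmarking harness. A sketch using only the standard library, with a stand-in callable where the real ONNX Runtime session's `run()` would go (the `fake_infer` stub is an assumption for illustration):

```python
import statistics
import time

def benchmark(infer, frames, warmup=3):
    """Return the median per-frame latency in milliseconds."""
    for frame in frames[:warmup]:
        infer(frame)                      # warm up caches / lazy init
    latencies = []
    for frame in frames:
        start = time.perf_counter()
        infer(frame)
        latencies.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(latencies)   # median resists scheduler spikes

# Stand-in for the real model call, e.g. an onnxruntime session's run().
def fake_infer(frame):
    return sum(frame)

median_ms = benchmark(fake_infer, [[1, 2, 3]] * 20)
print(f"median latency: {median_ms:.3f} ms")
```

Running the same harness against the quantized ONNX model on the Jetson gives a direct pass/fail check against the sub-50 ms budget.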
Post-deployment, the team monitors inference speed and accuracy via dashboard logs, watches for model drift with periodic retraining on new data, and runs warehouse queries on defect trends, targeting >95% uptime and <50 ms latency. Maintenance includes quarterly updates to augmentations and optimizations, monthly security patches, and cost controls, with alerts on high defect rates triggering interventions.
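The high-defect-rate alerting described above can be sketched as a rolling-window monitor. The class name, window size, and 20% threshold below are illustrative assumptions, not values from the project:

```python
from collections import deque

class DefectRateMonitor:
    """Rolling defect-rate monitor; alerts when the rate crosses a threshold."""

    def __init__(self, window=100, threshold=0.05):
        self.results = deque(maxlen=window)   # True = defect detected
        self.threshold = threshold

    def record(self, is_defect):
        """Record one inference result; return True if an alert should fire."""
        self.results.append(bool(is_defect))
        return self.rate() > self.threshold

    def rate(self):
        """Fraction of defective frames in the current window."""
        return sum(self.results) / len(self.results) if self.results else 0.0

monitor = DefectRateMonitor(window=10, threshold=0.2)
alerts = [monitor.record(d) for d in [0, 0, 1, 0, 1, 1]]
print(monitor.rate())   # 3 defects over 6 frames = 0.5
print(alerts[-1])       # True: rate is above the 0.2 threshold
```

The bounded `deque` makes the monitor constant-memory, which suits long-running edge deployments; the alert flag can then drive the dashboard notifications described above.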